Abstract


This  thesis  is  about  the  theory  and  practice  of  intensional  semantics.  Traditional  denotational 
models  of  programming  languages  are  usually  extensional  in  that  they  concern  themselves  only  with 
input/output  properties  of  programs.  The  meaning  of  a  program  is  typically  taken  to  be  a  function 
from  input  to  output  containing  no  information  about  the  way  that  function  computes  its  result. 
In  an  intensional  denotational  semantics,  the  meaning  of  a  program  is  an  object  embodying  aspects 
of  the  computation  strategy.  The  structure  of  the  object  varies,  depending  on  the  language  one 
models  and  the  intended  usage.  For  instance,  previous  intensional  semantics  have  been  developed 
using  functions  on  richer  domains,  pairs  of  a  function  and  a  computation  strategy,  and  sequential 
algorithms,  and  they  were  used  to  reason  about  efficiency,  complexity,  order  of  evaluation,  degrees 
of  parallelism,  efficiency-improving  program  transformations,  and  so  on. 

In  the  first  part  of  this  thesis,  we  develop  an  intensional  semantics  based  on  abstract  circuits. 
A  program  is  mapped  to  a  circuit,  whose  dimensions  tell  us  how  much  parallel  work  and  time  is 
required  to  execute  the  program.  We  relate  the  circuit  dimensions  to  various  execution  strategies, 
and  to  more  traditional  models  of  parallel  execution,  such  as  the  PRAM.  Our  main  application 
for  circuit  semantics  is  the  establishment  of  relative  intensional  expressiveness  results.  Extensional 
expressiveness  is  concerned  with  whether  a  construct  enables  us  to  compute  new  functions.  Since 
most programming languages are Turing-complete, this is usually not very interesting. On the other
hand,  intensional  expressiveness  is  concerned  with  whether  a  construct  enables  us  to  write  more 
efficient  programs.  Utilizing  a  somewhat  surprising  connection  with  the  field  of  circuit  complexity, 
we  study  the  relative  intensional  expressive  power  of  various  deterministic  and  nondeterministic 
parallel  extensions  of  PCF. 

Although  most  of  our  results  have  to  do  with  parallel  programming  languages,  we  also  study 
relative  intensional  expressiveness  in  a  sequential  setting.  Using  techniques  different  from  circuit 
semantics,  we  compare  Colson’s  primitive  recursive  algorithms  to  Berry  and  Curien’s  sequential 
algorithms,  in  the  area  of  efficient  expressibility  of  a  function  that  computes  the  minimum  of  two 
lazy  natural  numbers. 

In  the  second  part  of  this  thesis,  we  establish  the  practical  utility  of  intensional  semantics,  by 
taking  an  existing  semantics,  that  of  sequential  algorithms  on  concrete  data  structures,  and  using 
it  to  develop  a  refinement  type  inference  system.  The  system  features  recursive  types,  subtyping, 
intersection  types,  polymorphism,  and  overloading.  The  types  are  the  concrete  data  structures, 
and  the  terms  are  expressions  in  a  lazy,  higher-order,  polymorphic,  functional  language,  which  are 
compiled  to  categorical  combinators  represented  by  sequential  algorithms.  A  type  may  be  refined 
by  several  subtypes  (for  instance,  bool  can  be  refined  by  true  and  false).  The  type  always  differs 
from  its  refinements  at  a  finite  number  of  points.  If  a  term  has  a  regular  type,  then  the  system 
enters  into  an  interrogative  abstract  interpretation  session  with  it,  seeking  to  evaluate  it  only  at 
those  points  relevant  from  the  point  of  view  of  refinement  type  inference.  Sequential  algorithms 
provide  very  precise  information  about  the  dependence  of  pieces  of  output  on  pieces  of  input,  and 
we  can  use  this  intensional  information  to  generate  a  refinement  type.  We  prove  soundness  of  both 
the  type  inference  and  refinement  type  inference,  and  we  show  several  examples  from  our  prototype 
implementation. 


To  my  father  and  to  the  memory  of  my  grandfather 
Chapter 1

Introduction 


This  thesis  explores  theoretical  and  practical  issues  in  the  semantics  of  programming  languages. 
On  the  theoretical  side,  we  compare  various  sequential  and  parallel  programming  languages,  with 
the  aim  of  establishing  when  one  allows  us  to  write  more  efficient  programs  than  another.  We 
call this pursuit relative intensional expressiveness, and the main tool we use to achieve results is
intensional semantics. Generally speaking, an intensional semantics is any semantics mapping a
program  into  an  object  which  provides  some  insight  into  the  way  the  program  computes  its  result, 
that  is,  the  computation  strategy  of  the  program. 

Since an intensional semantics provides information about the computation strategy of a program,
the question naturally arises of how to take advantage of such information for the purpose
of  program  analysis.  On  the  practical  side  of  this  thesis,  we  show  how  to  do  this  by  designing  a 
refinement  type  inference  system  using  sequential  algorithms  on  concrete  data  structures. 

Before going any further, we shall describe in more detail what we mean by intensional semantics,
relative intensional expressiveness, and refinement types.


1.1  Intensional  semantics 

The word intension is a loaded one in computer science in general, and even in the area of programming
languages in particular. First, we give a brief description of the various usages of the
word,  pointing  out  the  intended  one  in  this  work,  followed  by  a  discussion  of  intensional  semantics 
proper,  and  a  simple  example. 

1.1.1  The  extension  of  intension 

The  word  originated  in  philosophy:  intension  is  the  set  of  all  attributes  thought  of  as  essential 
to the meaning of a term, as opposed to extension, which is the set of objects to which a term
applies.  It  is  used  in  logic:  intensional  logic  is  the  branch  of  logic  concerned  with  assertions  whose 
meaning  is  dependent  on  an  implicit  context.  The  logic  usage  led  to  one  of  the  meanings  in  the 
programming  languages  community:  an  intensional  programming  language  is  one  with  context 
dependent  constructs  (for  instance,  a  notion  of  execution  time  step).  The  first  such  language  was 
Lucid,  occurring  in  both  sequential  [31]  and  parallel  [4]  flavors. 

Another meaning of intension/extension is the opposition between the function-as-a-program /
function-as-a-graph  views.  It  is  used  this  way  in  recursion  theory  and  proof  theory,  and  this  is  the 
intended  meaning  here.  An  intensional  programming  language  is  one  with  constructs  which  make 
explicit  intensional  properties,  such  as  order  of  evaluation,  degree  of  parallelism,  etc.  An  example  of 
such a language is Berry and Curien’s CDSO [7], a programming language of sequential algorithms
on  concrete  data  structures. 

1.1.2  What  is  intensional  semantics? 

Traditionally,  denotational  semantics  has  mainly  been  used  to  reason  about  extensional  properties 
of programs. The meaning of a program is typically taken to be a function from input to output
conveying no information about the way that function computes its result. For instance, two
sorting  programs,  such  as  bubblesort  and  mergesort,  are  mapped  by  an  extensional  semantics  to 
the  same  input/output  function,  the  function  that  sorts  its  input.  However,  the  two  programs  are 
very  different  in  terms  of  efficiency.  This  is  an  intensional  feature.  In  an  intensional  denotational 
semantics,  the  meaning  of  a  program  is  an  object  embodying  aspects  of  the  program’s  computation 
strategy, i.e., the way the program computes its result, and thus by choosing an appropriate intensional
model, bubblesort and mergesort can be differentiated. Ideally, one would like to be able to
use  an  intensional  semantics  to  establish  relative  efficiency  results,  e.g.,  mergesort  is  “better”  than 
bubblesort  in  average  or  worst  case. 

There  are  many  ways  of  constructing  intensional  denotational  models.  We  outline  just  a  few 
possibilities: 

• We could take the meaning of a program to be a function on a richer domain (e.g., [11, 19])
whose  structure  permits  us  to  deduce  information  about  computation  strategy.  This  is  usually 
achieved  by  introducing  partially  defined  elements  in  the  model;  by  knowing  what  our  program 
does  on  partial  inputs,  we  can  get  an  idea  of  the  evaluation  strategy. 

•  We  could  take  the  meaning  to  be  a  pair  consisting  of  a  function  and  an  object  conveying 
intensional  information;  this  object  could  represent  the  cost  of  evaluating  the  function,  or 
could be a function from inputs to costs (e.g., [44, 77]).

• We could dispense with functions as meanings altogether, and use algorithms instead (e.g., the
Berry-Curien intensional model for sequential languages using sequential algorithms on concrete
data structures [6]).
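
As a rough illustration of the second possibility (a function paired with cost information), the following Python sketch instruments two sorting routines so that each returns its result together with a count of the comparisons it performed. The names `bubblesort`, `mergesort`, and `extensional`, and the choice of comparisons as the cost measure, are assumptions made for this sketch; they are not taken from the cited work.

```python
from typing import Callable, List, Tuple

Meaning = Callable[[List[int]], Tuple[List[int], int]]


def bubblesort(xs: List[int]) -> Tuple[List[int], int]:
    """Return (sorted copy of xs, number of comparisons performed)."""
    out = list(xs)
    comparisons = 0
    for i in range(len(out)):
        for j in range(len(out) - 1 - i):
            comparisons += 1
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out, comparisons


def mergesort(xs: List[int]) -> Tuple[List[int], int]:
    """Return (sorted copy of xs, number of comparisons performed)."""
    if len(xs) <= 1:
        return list(xs), 0
    mid = len(xs) // 2
    left, cost_left = mergesort(xs[:mid])
    right, cost_right = mergesort(xs[mid:])
    merged: List[int] = []
    comparisons = cost_left + cost_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


def extensional(meaning: Meaning) -> Callable[[List[int]], List[int]]:
    """Forget the cost component, leaving only the input/output function."""
    return lambda xs: meaning(xs)[0]
```

Under `extensional`, the two routines denote the same sorting function; they differ only in the cost component, which is the kind of collapse onto an extensional model discussed in the text.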

An  important  point  to  note  is  that  intensionality  is  relative.  A  semantics  can  be  more  intensional 
than  another  one.  For  each  extensional  semantics  there  is  a  hierarchy  of  intensional  semantics  that 
add  more  and  more  information.  Our  choice  of  an  intensional  semantics  should  be  dependent  on 
what  program  properties  we  wish  to  reason  about,  i.e.,  we  should  be  able  to  pick  the  relevant 
amount  of  detail  for  the  problem  at  hand. 

When  one  is  not  interested  in  the  intensional  aspects  of  program  behavior,  the  intensional 
model  should  agree  with  the  extensional  one.  In  other  words,  one  should  be  able  to  throw  away  the 
extra  detail  in  an  intensional  model  (e.g.,  the  computation  strategy)  and  have  it  collapse  onto  an 
extensional  model.  If  our  intensional  models  are  such  “conservative  extensions”  of  the  extensional 
one,  then  we  could  reason  about  both  intensional  and  extensional  aspects  at  the  same  time. 

We will be referring to an intensional denotational semantics simply as an intensional semantics,
although in general, an intensional semantics is any semantics which enables one to reason
about  intensional  features.  In  particular,  operational  semantics  has  also  been  used  to  reason  about 
intensional  issues  [75,  9,  41].  We  are  particularly  interested  in  denotational  models  because  they 
are  defined  compositionally  and  permit  algebraic  reasoning  (to  show  that  two  programs  have  the 
same  meaning,  we  need  only  show  that  they  have  the  same  denotation),  and  they  enable  the  use 
of  well-known  techniques  for  reasoning  about  programs,  such  as  fixed-point  induction  [80]. 
Figure 1.1: The lazy natural numbers


1.1.3 An example: Primitive recursion and the lazy natural numbers

For a simple example of an intensional semantics, consider the semantics of primitive recursive
(PR) algorithms. PR algorithms are just syntax for expressing PR functions [57]. The syntax is
in the form of a rewrite system obeying certain syntactic restrictions corresponding to the format
of primitive recursive function definitions (see Colson [19, 20] for a formal definition). The PR
algorithms operate on integers in unary representation, denoted by 0, S(0), and so on, where S
stands for successor. Consider the following two algorithms for integer addition [20]:

add1(0, y) = y

add1(S(x), y) = S(add1(x, y))


add2(x, 0) = x

add2(x, S(y)) = S(add2(x, y))

The standard extensional denotational semantics for add1, add2 maps them both into the
addition function of type $N^2 \to N$, where $N$ is the flat domain of natural numbers. A simple
intensional semantics may be provided by using the lazy natural numbers [19, 20, 22]. The domain
LNAT is shown in Figure 1.1. LNAT captures the temporal aspect of finding out what an input is.
At $S^k(\bot)$ we don’t know yet if we have the number $S^k(0)$, or something larger (at least $S^{k+1}(\bot)$).
This intensional semantics is sufficient to distinguish between the two addition algorithms. Using
the meaning function $[\![\,\cdot\,]\!]$ from [20, 22] (which makes the meaning $\bot$ when an algorithm tries to
recur on $\bot$) we have:

$[\![add1]\!](S^2(\bot), S(\bot)) = S^2(\bot)$
$[\![add2]\!](S^2(\bot), S(\bot)) = S(\bot)$

The LNAT semantics is richer than the $N$ semantics, and contains intensional information; the
above equations can be interpreted as showing that at some point, add2 tries to evaluate part of its
second argument before the first, whereas add1 looks at its first input first. Although the LNAT
semantics still represents the meanings of add1 and add2 as functions (from LNAT × LNAT to
LNAT), it conveys implicit information about the computation strategy of the related functions
from $N^2$ to $N$.
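
The two equations can be replayed with a small Python sketch. It is only an illustration under assumptions: lazy naturals are encoded as nested tuples, `BOT` stands for the undefined element, and the functions return `BOT` as soon as they would have to recur on an undefined argument, mirroring the informal description of the meaning function. None of these names come from Colson's papers.

```python
BOT = "⊥"   # the undefined (not yet known) element
ZERO = "0"  # the fully known number zero


def S(x):
    """Successor: one more layer of information on top of x."""
    return ("S", x)


def add1(x, y):
    """Addition by recursion on the first argument."""
    if x == BOT:
        return BOT            # would have to inspect an unknown input
    if x == ZERO:
        return y
    return S(add1(x[1], y))


def add2(x, y):
    """Addition by recursion on the second argument."""
    if y == BOT:
        return BOT
    if y == ZERO:
        return x
    return S(add2(x, y[1]))


def show(x):
    """Render an element of the sketch as S^k(0) or S^k(⊥)."""
    k = 0
    while isinstance(x, tuple):
        k, x = k + 1, x[1]
    return f"S^{k}({x})"
```

With `x = S(S(BOT))` and `y = S(BOT)`, `show(add1(x, y))` gives `S^2(⊥)` and `show(add2(x, y))` gives `S^1(⊥)`, while on fully defined arguments the two functions agree.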
Colson used the LNAT semantics to study the efficient expressibility of a function that computes
the  minimum  of  two  natural  numbers  represented  in  unary  notation.  Although  the  semantics 
appears  quite  simple,  it  was  enough  to  allow  Colson  to  prove  a  rather  surprising  impossibility 
result: PR algorithms cannot compute minimum efficiently. We shall present his work in more
detail  in  the  next  chapter.  The  LNAT  semantics  will  also  appear  in  our  work,  when  we  conduct 
our  own  study  of  the  efficient  expressibility  of  a  minimum  function  in  the  context  of  sequential 
algorithms  on  concrete  data  structures,  and  their  generalization  to  parallel  algorithms. 


1.2  Relative  intensional  expressiveness 

In  the  first  half  of  this  thesis,  we  are  interested  in  establishing  relative  intensional  expressiveness 
results for programming languages. Most work in the past has focused on extensional expressiveness:
language $L_1$ is extensionally more expressive than $L_2$ if $L_1$ can compute all the functions
computable in $L_2$. We say that language $L_1$ is intensionally more expressive than $L_2$ if $L_1$ can
compute all the functions computable in $L_2$ with asymptotic complexity that is at least as good. The
notions  of  complexity  we  concentrate  on  are  time  and  work.  Note  that  there  has  been  a  lot  of 
work  comparing  the  intensional  expressiveness  of  different  models  of  computation.  For  instance, 
allowing  only  a  single  tape  for  a  Turing  machine  can  square  the  time  necessary  to  recognize  a 
language  versus  a  two-tape  Turing  machine  [48];  and  there  are  certain  problems  for  which  there 
exist  faster  CRCW  PRAM  algorithms  than  EREW  PRAM  algorithms  [23].  Our  goal  is  to  compare 
programming  languages,  not  their  underlying  computation  models.  We  shall  be  careful  to  point 
out  when  we  need  to  make  special  assumptions  about  the  computation  model  in  order  to  achieve 
our  programming  language  comparisons. 

It would appear that there should be close connections between relative intensional expressiveness
and complexity theory. Indeed, there has been some work on machine-independent characterizations
of complexity classes. A long time ago, Cobham [18] characterized P as a language similar
to  primitive  recursive  algorithms.  Very  recently,  Clote  [17]  did  the  same  for  NC,  which  can  be 
viewed  as  the  class  of  functions  that  can  be  computed  “quickly”  in  parallel.  The  characterization 
of NC also takes the form of a variant of primitive recursive algorithms. One of the main problems
of complexity theory, P versus NC, can then be viewed as a problem of relative intensional
expressiveness. 

Quite  obviously,  we  should  not  expect  the  act  of  viewing  a  problem  as  a  relative  intensional 
expressiveness  problem  on  programming  languages  to  make  it  easier.  Proving  negative  results  and 
lower  bounds  is  difficult,  no  matter  how  one  looks  at  it.  It  should  be  interesting  to  see,  however, 
if  any  useful  new  ideas  emerge  at  the  interface  of  programming  languages  theory  and  complexity 
theory.  We  hope  that  our  work  can  be  seen  as  a  small  step  in  this  direction. 

We  compare  both  sequential  and  parallel  languages.  First  we  examine  the  efficient  expressibility 
of  minimum  on  lazy  natural  numbers  in  CDSO  and  in  a  parallel  extension  of  CDSO  we  call  CDSP, 
and contrast that to Colson’s results with PR algorithms. Then we compare four deterministic
parallel  extensions  of  PCF  [70],  which  is  the  prototypical  sequential  functional  language.  Finally, 
we  compare  a  deterministic  and  a  nondeterministic  extension  of  PCF.  To  aid  us  in  the  comparisons 
of  PCF  extensions,  we  introduce  a  new  intensional  semantics  called  circuit  semantics.  Circuit 
semantics  associates  a  gate  with  each  basic  construct  of  the  language,  and  takes  the  meaning  of  a 
program  to  be  a  circuit.  The  dimensions  of  the  circuit  enable  reasoning  about  running  time  and 
work  required  for  execution.  Circuit  semantics  also  allows  us  to  formalize  a  connection  between 
deterministic  and  nondeterministic  parallel  PCF  programs,  and  monotone  and  De  Morgan  boolean 
circuits  [87],  respectively. 
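
To make the notion of circuit dimensions concrete, here is a minimal Python sketch. It is not the circuit semantics developed in this thesis: the `Gate` class, the convention that input leaves count as gates, and the functions `work` and `depth` are all assumptions chosen for illustration. The sketch only shows how two circuits with the same input/output behavior can have the same size (work) but different depth (parallel time).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gate:
    """A node in a tree-shaped circuit (no sharing of subcircuits)."""
    name: str
    inputs: Tuple["Gate", ...] = ()


def work(g: Gate) -> int:
    """Total number of gates: a stand-in for the work of the circuit."""
    return 1 + sum(work(i) for i in g.inputs)


def depth(g: Gate) -> int:
    """Longest path from an input to the output: a stand-in for time."""
    return 1 + max((depth(i) for i in g.inputs), default=0)


def chain_or(xs: List[Gate]) -> Gate:
    """Left-nested OR: each gate waits for the previous one."""
    acc = xs[0]
    for x in xs[1:]:
        acc = Gate("or", (acc, x))
    return acc


def tree_or(xs: List[Gate]) -> Gate:
    """Balanced OR: the two halves can be evaluated in parallel."""
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return Gate("or", (tree_or(xs[:mid]), tree_or(xs[mid:])))
```

For eight inputs, both circuits contain 15 gates under this counting, but the chained version has depth 8 and the balanced version depth 4.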
1.3 Refinement types

The  idea  of  refinement  types  is  due  to  Freeman  and  Pfenning  [35,  36].  In  their  work,  the  programmer 
may  choose  to  decompose  a  type  into  a  collection  of  subtypes  by  means  of  a  recursive  datatype 
declaration.  The  subtypes  are  called  refinements  of  the  original  type.  They  also  developed  a  type 
inference  system  which  first  obtains  a  regular  type  (not  involving  refinements)  for  a  program,  then 
attempts  to  obtain  a  refinement  type  for  it  by  means  of  refinement  type  inference  rules.  The 
intended  use  for  the  system  was  as  a  programmer’s  aid  in  eliminating  spurious  warnings  generated 
by  the  Standard  ML  type  inference  for  missing  patterns  that  were  actually  unreachable. 

We  think  that  the  idea  of  refinement  types  is  a  very  interesting  one,  but  our  approach  has  a  very 
different  flavor.  We  are  interested  in  program  analysis  and  we  do  not  have  refinement  type  inference 
rules.  Instead  we  perform  an  abstract  interpretation  on  the  program  directly  (instead  of  doing  it 
at  the  level  of  types).  As  in  Freeman  and  Pfenning,  the  programmer  has  to  specify  refinements, 
and  we  rely  on  the  program  to  have  a  regular  type  before  trying  to  generate  a  refinement  type  for 
it.  We  shall  have  more  to  say  about  differences  between  the  two  systems  later,  when  we  present 
our  approach  in  detail.  For  now,  we  wish  to  illustrate  the  basic  idea  with  some  examples. 

Suppose  we  have  a  generic,  lazy  functional  language  with  a  syntax  similar  to  that  of  Standard 
ML  (the  examples  below  are  actual  programs  from  our  implementation) .  Suppose  further  that  we 
decide to distinguish between true and false, i.e., we want to refine bool. We would expect the
following  program  with  regular  type  bool  ->  bool: 

val not = fn x => if x then false else true;

to have refinement type true -> false ∧ false -> true, where ∧ denotes intersection of types. The
intuitive meaning is that the program not has both types true -> false and false -> true.

Something  more  interesting  happens  when  we  decide  to  refine  a  recursive  type.  Suppose  we 
have integer lists (intlist), and we want to distinguish between empty lists (empty_intlist), lists
with one element (one_intlist), and lists of two or more elements (many_intlist). Then we would
expect the map function of regular type (int -> int) -> intlist -> intlist:

val map = letrec mapf =
            fn f => fn l => if null l then []
                            else (f (hd l)) :: ((mapf f) (tl l))
          in mapf
          end;

to  have  the  following  refinement  type: 

(int -> int) -> empty_intlist -> empty_intlist ∧
(int -> int) -> one_intlist -> one_intlist ∧
(int -> int) -> many_intlist -> many_intlist.

We  make  use  of  intensional  semantics  to  help  generate  such  refinement  types  by  translating  the 
programs  to  categorical  combinators  [26],  which  themselves  denote  sequential  algorithms  (i.e.,  CDSO 
programs).  The  types  are  represented  as  concrete  data  structures  [56].  A  type  and  its  refinements 
will  always  be  distinguishable  by  examination  at  a  finite  number  of  points.  So  we  perform  abstract 
interpretation of the CDSO program over the lattice of such points, and use the very precise information
on the dependence of parts of the output on parts of the input provided by sequential
algorithms  to  generate  the  refinement  type. 
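
The idea of examining a term only at the finitely many points that separate a type from its refinements can be hinted at with a deliberately naive Python sketch for the bool example. It is an assumption-laden illustration, not the system described in this thesis: it simply runs a Python function at both refinement points of bool, whereas the actual approach interprets sequential algorithms abstractly. The names `refine_bool_fn` and `show` are hypothetical.

```python
from typing import Callable, List, Tuple

Arrow = Tuple[str, str]


def refine_bool_fn(f: Callable[[bool], bool]) -> List[Arrow]:
    """Probe f at each refinement point of bool; collect one arrow per point."""
    arrows: List[Arrow] = []
    for point in (True, False):
        result = f(point)
        arrows.append((str(point).lower(), str(result).lower()))
    return arrows


def show(arrows: List[Arrow]) -> str:
    """Render the collected arrows as an intersection type."""
    return " ∧ ".join(f"{a} -> {b}" for a, b in arrows)


def not_(x: bool) -> bool:
    """Python counterpart of the `not` program in the text."""
    return False if x else True
```

Here `show(refine_bool_fn(not_))` yields `true -> false ∧ false -> true`. Exhaustive probing is only possible because bool has two points; for recursive types such as intlist, a finite set of distinguishing points and a proper abstract interpretation are needed instead.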
1.4 Claims of the thesis

The  guiding  principle  behind  this  work  and  the  central  claim  of  the  thesis  is: 

The  exploration  of  intensional  semantics  is  interesting  from  both  a  theoretical  and  practical 
point  of  view. 

More  specifically,  we  claim  the  following: 

•  Definition  of  the  notion  of  relative  intensional  expressiveness  for  programming  languages. 

•  Definition  of  CDSP,  a  parallel  extension  of  CDSO  with  a  query  construct  [12]. 

• Proof that CDSO is more expressive than PR algorithms, but less expressive than CDSP.

• Formalization of a new intensional semantics, circuit semantics, and comparisons with parallel
evaluation strategies [49]: call-by-speculation, parallel call-by-value, and parallel eager
evaluation.

•  Identification  of  a  hierarchy  of  intensional  expressiveness  for  deterministic  parallel  extensions 
of  PCF:  parallel  conditional  on  booleans  is  equivalent  to  parallel  or;  both  are  less  expressive 
than parallel conditional on integers, which in turn is less expressive than query.

• Formalization of a connection between work and time complexity of functional programs extended
with deterministic and nondeterministic query, and monotone and De Morgan circuits.
Use  of  this  connection  and  a  hardware  assumption  to  show  that  nondeterministic  query  is 
more  expressive  than  the  deterministic  one. 

•  Development  of  a  type  system  based  on  concrete  data  structures.  Implementation  of  type 
inference  for  CDSO. 

•  Proof  of  soundness  for  both  type  inference  and  refinement  type  inference. 

•  Development  of  a  new  application  of  CDSO. 

•  Implementation  of  a  practical  approach  to  refinement  type  inference. 

1.5  Related  work 

We  consider  separately  related  work  in  the  areas  of  intensional  semantics,  relative  intensional 
expressiveness,  and  refinement  types  and  type  inference  for  CDSO. 

1.5.1  Intensional  semantics 

The  related  work  surveyed  here  is  composed  of  several  different  strands.  The  common  element  is 
a  concern  with  the  analysis  of  intensional  aspects  of  programs.  In  most  cases,  the  programming 
language  is  sequential,  and  the  analysis  is  carried  out  from  an  operational  presentation  of  the 
semantics.  The  notable  exceptions  will  be  pointed  out. 

We  discuss  relevant  work  on  the  following  topics:  intensional  semantic  models  for  programming 
languages, intensional hierarchies, and automatic complexity analysis. The section is broken down
by  area  (e.g.,  recursion  theory),  rather  than  topic,  to  give  an  idea  of  the  naturality  and  pervasiveness 
of  these  ideas. 
Recursion theory

The distinction between a function and the algorithm that computes it was made early on in recursion
theory [76], but the main focus of the theory is on the functions, that is, on the extensional
features. Most results have to do with closure properties of various collections of recursive functions.
However, a so-called “abstract” recursion theory (also called theory of algorithms) has been
developed,  chiefly  by  Moschovakis  [64,  65,  66],  although  the  ideas  go  back  to  Kleene  and  others. 
In  [64]  Moschovakis  develops  the  foundation  for  the  theory.  The  semantics  of  a  recursive  partial 
function  is  a  set  of  functionals,  called  a  recursor.  It  is  essentially  a  higher-order  functional  program 
defined  by  a  family  of  mutually  recursive  function  definitions.  Intensional  analysis  can  be  performed 
in  an  operational  style  on  the  recursor.  The  possibility  of  implementing  the  language  of  recursors 
as a programming language called REC is discussed. More recent works [65, 66] update and elaborate
on the older paper. Algorithms are modeled as recursors, which are part of a programming
language  called  FLR  (Formal  Language  of  Recursion).  The  main  thrust  is  in  proving  that  FLR  is 
a  reasonable  language  in  terms  of  including  all  desirable  intensions. 

Proof  theory 

Proof  theory  [38,  39]  has  been  mainly  concerned  with  extensional  aspects,  as  well.  A  series  of 
functional systems of increasing extensional expressive power has been studied: linear λ-calculus,
typed λ-calculus, primitive recursion, Gödel’s system T, Martin-Löf’s intuitionistic type theory,
Girard-Reynolds polymorphic second-order λ-calculus (system F), and the theory of constructions.
None  of  these  systems  is  Turing-complete;  all  their  programs  terminate. 

Recently there has been work on intensional aspects of some of these functional systems. Colson’s
work [19, 20] with primitive recursive algorithms was mentioned already. He also studied
system T and system F. System T can express an efficient algorithm for minimum. It is an open
problem whether $\min(n,p)$ can be written in system F with complexity $O(\min(n,p))$. The current
best program (see [29]) is $O(\min(n,p)\log(\min(n,p)))$. Interestingly, system T appears to be
intensionally  stronger  than  system  F,  even  though  it  is  extensionally  weaker. 

Programming  languages 

There  is  a  large  body  of  literature  devoted  to  automatic  complexity  analysis.  There  are  typically 
two  phases  to  an  automatic  complexity  analysis  system:  deriving  recurrences  for  the  complexity 
of  a  program,  and  solving  them.  Deriving  the  recurrences  is  usually  accomplished  by  constructing 
a  cost  (or  complexity)  function  from  the  program  and  obtaining  from  this  a  function  of  the  input 
size.  The  cost  function  normally  counts  the  number  of  rewrites  in  the  operational  semantics,  plus 
constants  for  the  primitive  operations. 

Most of the work in automatic complexity analysis has been devoted to studying strict, sequential,
first-order, functional languages (the earliest examples are Wegbreit [86] and Le Métayer [60]).
There  has  also  been  some  effort  in  the  area  of  lazy  first-order  languages  [79,  85].  The  derivation 
of  a  program’s  complexity  is  more  complicated  in  this  setting,  because  only  part  of  an  argument 
might  be  needed.  There  has  also  been  work  with  higher-order  strict  and  lazy  languages  [78].  The 
basic  idea  is  to  construct  cost-closures  so  a  function  can  carry  around  cost  information. 

Several  authors  have  used  profiling  semantics,  i.e.,  operational  semantics  augmented  with  time 
and  work  information,  to  perform  automatic  complexity  analysis  in  a  parallel  setting.  Roe  [75] 
considered a parallel lenient language, and Zimmermann [89, 90] a data-parallel language. The
language Zimmermann analyzes is a first-order parallel language with vectors and a parallel “for-all”
construct (and similar others) ranging over vectors. The approach is the same as in the
sequential  case:  a  cost  function  is  constructed,  with  the  cost  of  the  parallel  “for-all”  equal  to  some 
constant  plus  the  maximum  cost  of  the  operation  over  each  vector  element. 
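
The cost rule for the parallel “for-all” can be written down directly. The Python sketch below is an illustration only: the constant `FORALL_OVERHEAD` and both function names are hypothetical, and the sequential variant is included purely for contrast; neither is taken from the cited analysis.

```python
from typing import Sequence

FORALL_OVERHEAD = 1  # hypothetical constant charged for the construct itself


def forall_time(element_costs: Sequence[int]) -> int:
    """Parallel cost: a constant plus the most expensive element."""
    return FORALL_OVERHEAD + max(element_costs, default=0)


def sequential_time(element_costs: Sequence[int]) -> int:
    """Contrast case: a sequential loop pays for every element in turn."""
    return FORALL_OVERHEAD + sum(element_costs)
```

For per-element costs `[3, 1, 4, 1, 5]`, `forall_time` returns 6 while `sequential_time` returns 15.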

Hudak  and  Anderson  [49]  developed  pomsets  as  a  semantics  for  parallel  functional  programs. 
They  were  able  to  distinguish  between  various  evaluation  strategies  (call-by-value,  call-by-name, 
call-by-need, call-by-speculation). More recently, Blelloch and Greiner [9, 41] provided profiling
semantics  for  parallel  call-by-value  and  call-by-speculation.  Their  aim  in  [9]  was  to  show  that 
good  upper  bounds  for  merging  and  sorting  can  be  obtained  with  an  implicitly  parallel  language. 
The  second  model  [41]  was  used  to  prove  the  efficiency  of  a  particular  implementation  of  call-by- 
speculation.  Both  models  were  related  to  more  traditional  parallel  models  such  as  the  PRAM. 

The circuit model we develop in this thesis is most closely related to call-by-speculation. The
differences  are  due  to  the  presence  of  conditionals  in  the  language.  In  contrast  to  earlier  work,  we 
are  interested  in  proving  lower  bounds  and  performing  intensional  comparisons  between  parallel 
languages. 

An  interesting  approach  to  automatic  complexity  analysis,  in  the  setting  of  strict,  sequential, 
first-order  languages,  was  taken  by  Rosendahl  [77].  He  also  constructs  a  time  (or  cost)  function 
from  the  program,  but  in  order  to  talk  about  the  correctness  of  this  time  function,  he  defines  an 
“instrumented”  denotational  semantics  which  returns  a  denotation  and  the  time  complexity.  A 
time-bound  function,  which  gives  an  upper  bound  on  computation  time  for  all  inputs  of  a  certain 
size,  is  derived  by  abstract  interpretation  from  the  time  cost  function. 

Talcott  developed  a  theory  of  intensional  semantics  [83].  Essentially,  the  extraction  of  the 
intensional  information  is  based  on  a  low-level  operational  semantics:  from  a  program  she  constructs 
a  computation  sequence.  Analysis  is  performed  on  the  computation  sequence:  the  time  complexity 
of  a  program  is  the  length  of  its  computation  sequence.  Other  properties  can  be  analyzed,  such  as 
maximum  stack  depth  and  number  of  function  calls. 

Gurr [44] extended denotational semantics in order to model intensional aspects (resource
requirements) of first-order, sequential languages. In his framework, the meaning of a program is a
pair  of  the  original  denotation  of  the  program  and  a  map  from  input  values  to  an  object  of  resource 
values.  The  object  of  resource  values  is  modeled  as  a  monoid  (a  semigroup  with  identity).  Time 
and  space  requirements  of  programs  can  be  formulated  in  this  framework.  He  also  studied  the 
derivation  of  exact  and  non-exact  complexities. 

1.5.2  Relative  intensional  expressiveness 

The only work we are aware of that compares the intensional expressiveness of two
programming languages is the already mentioned work of Colson on the expressiveness of primitive
recursion.

There has been little work on comparing determinism and nondeterminism. Felleisen, in his
theory of expressiveness [32], defined a construct $c$ as more expressive than another $c'$ if the
translation of a program using $c$ to one using $c'$ requires a global reorganization of the program. He
showed that adding side-effects to a sequential functional language increases expressive power.

The  literature  on  Id  [82],  an  implicitly  parallel  language,  has  produced  practical  examples 
of  comparisons  of  Id’s  purely  functional  core,  the  extension  with  I-structures  (single-assignment 
arrays),  and  the  extension  with  M-structures  (arrays  with  element-level  synchronization). 

There  has  been  notable  work  on  extensional  comparisons  of  merging  primitives  in  dataflow 
networks  [68].  One  of  the  functions  considered  there  is  poll  which  checks,  without  blocking,  if  an 
input  is  present.  We  make  use  of  poll  in  our  work.  However,  we  are  not  aware  of  any  relevant 
intensional comparisons of parallel constructs. The general opinion expressed in [62] (and echoed
in  [47])  seems  to  be  that  the  main  advantage  of  nondeterminism  is  in  specifying  a  process. 

There  has  been  some  recent  research  at  the  juncture  of  complexity  theory  and  programming 
languages  theory,  with  broadly  the  same  aim  of  bridging  the  gap  between  these  two  areas  of 
computer  science.  Aside  from  the  machine-independent  characterizations  of  P  and  NC  already 
mentioned,  there  has  been  work  on  the  characterization  of  P  in  terms  of  bounded  linear  logic 
[40] and λ-calculus [59]. In addition, Jones has commenced a reconstruction of computability and
complexity  theory  from  a  programming  languages  perspective  [53,  54]. 

1.5.3  Refinement  types  and  type  inference  for  CDSO 

Pierce [69] introduced $F_\wedge$, a variant of System F with intersection types, subtyping, and bounded
quantification. Using an explicit alternation construct called for, this system can derive
refinement types. Unfortunately the system is too powerful; it has explicit types, and type checking is
undecidable.

Reynolds  developed  the  programming  language  Forsythe  [74],  which  has  intersection  types  and 
subtyping,  but  no  polymorphism.  In  this  system,  an  intersection  type  can  contain  a  mixture  of 
ground  and  higher-order  types. 

There  has  been  much  work  on  type  systems  using  intersection  types  [46].  Such  systems  are 
usually  too  powerful  to  admit  type  inference;  [21]  is  an  exception.  Fuh  and  Mishra  [37]  developed  a 
type  inference  system  which  combines  polymorphism  and  subtyping,  but  does  not  have  intersection 
types. 

There  has  also  been  a  lot  of  work  on  type  systems  based  on  records  (see  [43]  for  several  examples, 
including  type  inference  systems).  Concrete  data  structures  (cds)  are,  in  some  sense,  similar  to 
records; they have cells (like fields in a record) which can be filled with values. Unlike records,
cds also have accessibility conditions, but this difference is not essential: the notion of subtyping we
develop  for  cds  is  very  similar  to  that  for  records.  More  important  is  the  fact  that  a  higher-order 
type  in  CDSO  is  also  a  cds,  and  we  can  interactively  ask  questions  about  the  values  of  its  cells. 

Soft  typing  [15]  is  a  type  inference  system  which  includes  polymorphism,  subtyping,  and  union 
types,  and  it  is  designed  for  dynamically-typed  languages;  when  a  program  fails  to  have  a  static 
type,  run-time  checks  are  included.  The  main  thrust  of  this  system  is  to  be  able  to  type  programs 
that  would  normally  be  rejected  by  standard  type  checking. 

Castagna, Ghelli, and Longo [16] defined the λ&-calculus, a calculus for overloaded functions
with  subtyping.  A  function  can  be  overloaded  with  the  addition  of  new  pieces  of  code.  The  types 
of  the  various  pieces  have  some  consistency  conditions.  In  CDSO,  programs  can  use  generic  cell 
and  value  references  which  can  result  in  overloading.  Our  notion  of  overloading  for  CDSO  types  is 
similar  to  [16],  except  that  we  do  not  build  the  consistency  conditions  into  the  type;  we  adopt  the 
same  notation. 

1.6  Outline 

Chapter  2  describes  the  work  that  we  are  building  upon  most  directly  in  this  thesis.  We  discuss  the 
full  abstraction  problem  for  PCF  and  Kahn  and  Plotkin’s  definition  of  sequentiality  using  concrete 
data  structures.  This  was  the  starting  point  of  Berry  and  Curien’s  work  on  sequential  algorithms 
on  concrete  data  structures  and  its  implementation  as  a  programming  language,  CDSO.  We  spend 
a  fair  amount  of  time  on  CDSO  as  it  features  prominently  in  the  second  part  of  this  thesis.  We 
describe  Brookes  and  Geva’s  work  on  a  parallel  extension  of  CDSO,  which  provides  us  with  one 
of the deterministic parallel constructs we study. Hughes and Ferguson’s use of CDSO to perform
abstract  interpretation  is  also  covered.  Although  our  approach  is  different,  knowledge  of  the  earlier 
work  is  useful.  Finally,  we  provide  a  brief  exposition  of  Freeman  and  Pfenning’s  work  on  refinement 
types,  and  Colson’s  work  on  intensional  expressiveness. 

Chapter  3  begins  our  relative  intensional  expressiveness  explorations.  Relying  on  Colson’s 
work, we show that CDSO is more expressive than PR (primitive recursive) algorithms. Even though CDSO programs
are sequential, they are not “ultimately obstinate,” like the PR algorithms. However, CDSO
still  cannot  compute  a  natural  version  of  the  minimum  function  on  lazy  natural  numbers.  The 
parallel  extension  CDSP  can  compute  that  function.  In  addition,  CDSP  can  compute  certain  n-ary 
functions  more  efficiently  than  CDSO. 

Circuit  semantics  is  introduced  in  Chapter  4.  Initially,  we  introduce  only  a  naive  version  of 
circuit  semantics  that  can  essentially  only  distinguish  programs  based  on  depth.  This  is  enough 
to  compare  four  deterministic  parallel  extensions  of  PCF  and  separate  them  into  three  levels  of 
intensional  expressiveness.  We  then  commence  a  more  careful  development  of  circuit  semantics, 
comparing  it  to  various  parallel  evaluation  strategies,  and  using  it  to  model  a  deterministic  and 
nondeterministic parallel extension of PCF. We formalize a connection between the circuit
dimensions of parallel PCF programs and monotone and De Morgan boolean circuits. Utilizing strong
results  from  complexity  theory,  and  assuming  hardware  that  can  detect  undefined  inputs,  we  are 
able  to  prove  an  intensional  separation  of  the  deterministic  and  nondeterministic  construct. 

Chapter  5  marks  the  beginning  of  the  second  part  of  the  thesis.  We  carefully  formalize  a 
type  system  based  on  concrete  data  structures,  that  includes  subtyping  and  intersection  types. 
We  show  the  decidability  of  subtyping  for  ground  concrete  data  structures,  and  we  introduce  a 
type  inference  system,  proving  its  soundness.  Then  we  add  polymorphism  and  overloading  to  the 
language,  showing  how  to  extend  the  subtyping  decision  procedure  and  the  type  inference  system. 
We  prove  soundness  for  the  extended  system. 

Refinement  type  inference  is  presented  in  Chapter  6.  We  define  refinement  types  for  CDSO 
and  show  how  the  intensional  information  present  in  a  sequential  algorithm  can  be  used  to  extract 
a  refinement  type.  We  introduce  a  generic,  lazy  functional  language,  and  show  how  it  can  be 
compiled  to  CDSO.  To  derive  refinement  types  for  expressions  built  up  from  sequential  algorithms, 
we  introduce  a  loop-detecting  evaluator,  and  show  how  we  need  only  evaluate  the  expression  at  a 
certain  (small)  number  of  cells.  We  prove  soundness  of  the  refinement  type  inference. 

Chapter  7  describes  our  prototype  implementation  and  includes  more  examples.  We  outline 
briefly  the  implementation  of  CDSO  itself,  which  is  based  upon  the  work  of  Devin.  Most  of  the 
chapter  is  devoted  to  our  implementation  of  type  inference  and  refinement  type  inference.  We 
provide  examples  showing  the  benefits  and  limitations  of  our  approach. 

Finally, Chapter 8 looks back on the thesis, draws some conclusions, and outlines areas for
possible future work.


Chapter  2 

Background 


This chapter surveys the work that we will be building upon most directly in what follows.
Section 2.1 gives a brief history of the full abstraction problem for PCF. Section 2.2 discusses Kahn and
Plotkin’s  concrete  data  structures  and  their  formulation  of  a  notion  of  sequentiality,  and  its  use  by 
Berry  and  Curien  in  constructing  an  intensional  semantics  of  sequential  algorithms  for  PCF.  Berry 
and  Curien’s  programming  language  CDSO,  which  is  a  direct  implementation  of  their  intensional 
semantics,  is  described  in  Section  2.3.  In  Section  2.4  we  describe  Brookes  and  Geva’s  extension  of 
Berry  and  Curien’s  work  to  the  setting  of  parallel  algorithms.  Section  2.5  covers  the  only  previous 
work  on  practical  applications  of  sequential  algorithms.  Freeman  and  Pfenning’s  refinement  type 
inference  system  is  described  in  Section  2.6.  Finally,  Colson’s  work  on  the  intensional  expressiveness 
of  primitive  recursive  algorithms  is  discussed  in  Section  2.7. 


2.1  PCF  and  full  abstraction 

When  a  language  is  designed,  the  semantics  which  is  normally  regarded  as  “the  definition”  of  the 
language  is  often  presented  in  an  operational  style,  by  reference  to  the  computations  of  an  abstract 
machine,  or,  in  the  case  of  structural  operational  semantics  [71],  by  a  set  of  rewrite  rules.  This 
leads  to  a  notion  of  program  equivalence  based  on  observability:  two  programs  will  be  considered 
equivalent  if,  when  inserted  into  the  same  context  (intuitively,  a  program  with  a  hole  in  it),  we  get 
the  same  final  result  after  execution,  as  characterized  by  the  abstract  machine  or  by  application  of 
the  rewrite  rules. 

If we give the language a denotational semantics as well, we obtain a different notion of
program equivalence: two programs are equivalent if they denote the same value. It would be useful
if  these  two  notions  of  equivalence  were  identical:  proving  denotational  equivalence  would  then 
automatically  imply  operational  equivalence,  and  vice  versa.  If  that  were  the  case  we  would  say 
that  the  denotational  semantics  is  fully  abstract  with  respect  to  the  operational  semantics  [70]. 
The  formulation  is  worded  this  way  because  the  operational  semantics  is  considered  as  intuitively 
known. 

Unfortunately,  it  turns  out  to  be  rather  difficult,  in  general,  to  design  fully  abstract  denotational 
semantics  [63,  8].  In  the  setting  of  sequential  languages,  there  has  been  a  great  deal  of  effort 
expended  in  discovering  a  fully  abstract  semantics  for  the  language  PCF  (Programming  Computable 
Functions)  [70,  42].  PCF  is  regarded  as  the  “prototypical”  sequential  programming  language.  It 
is a typed λ-calculus with two ground types, booleans ($o$) and integers ($\iota$), and the following set of
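\[
\begin{array}{lcll}
tt, ff & : & o & \text{(truth values)} \\
n & : & \iota & \text{(integers, } n \geq 0\text{)} \\
\supset_o & : & o \to o \to o \to o & \text{(boolean conditional)} \\
\supset_\iota & : & o \to \iota \to \iota \to \iota & \text{(integer conditional)} \\
\mathit{isZero?} & : & \iota \to o & \\
+1,\ -1 & : & \iota \to \iota & \\
Y_\sigma & : & (\sigma \to \sigma) \to \sigma & \text{(one for each } \sigma\text{)}
\end{array}
\]

Figure 2.1: PCF constants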


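\[
\begin{array}{l}
\supset_\sigma tt\ M_\sigma\ N_\sigma \to M_\sigma, \quad \text{for } \sigma = \iota, o \\
\supset_\sigma ff\ M_\sigma\ N_\sigma \to N_\sigma, \quad \text{for } \sigma = \iota, o \\
Y_\sigma M \to M (Y_\sigma M) \\
((\lambda x.\, M) N) \to [N/x]M \\
+1\ n \to n + 1, \quad \text{for } n \geq 0 \\
-1\ (n + 1) \to n, \quad \text{for } n \geq 0 \\
\mathit{isZero?}\ 0 \to tt \\
\mathit{isZero?}\ (n + 1) \to ff, \quad \text{for } n \geq 0
\end{array}
\]

\[
\frac{M \to M'}{(M N) \to (M' N)}
\qquad
\frac{M_o \to M_o'}{(\supset_\sigma M_o) \to (\supset_\sigma M_o')}
\qquad
\frac{N \to N'}{(M N) \to (M N')} \ \text{ if } M \text{ is } +1,\ -1,\ \mathit{isZero?}
\]

Figure 2.2: Operational semantics for call-by-name evaluation of PCF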


types, $\sigma$:

\[ \sigma ::= o \mid \iota \mid \sigma \to \sigma \]

The syntax for raw (untyped) terms is given by the grammar:

\[ M ::= c \mid x \mid \lambda x.\, M \mid M\, M \]

The  constants  traditionally  included  in  the  language  are  shown  in  Figure  2.1.  The  operational 
semantics  for  call-by-name  evaluation  is  shown  in  Figure  2.2.  For  simplicity,  we  blur  the  distinction 
between  numerals  and  integers,  and  use  n  to  denote  both.  We  omit  the  typing  rules,  which  are 
standard. 
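
To make the call-by-name rules of Figure 2.2 concrete, here is a minimal small-step evaluator sketch in Python. It is an illustration only: the tuple encoding of terms, the constant names (`succ`, `pred`, `iszero`, `if`, `Y`), and the `fuel` bound are assumptions of this sketch, typing is not checked, and substitution is naive (which is adequate only because closed programs are reduced at the top level, so substituted arguments are closed).

```python
# Terms: ('num', n), ('const', name), ('var', x), ('lam', x, body), ('app', f, a).
STRICT = {'succ', 'pred', 'iszero', 'if'}  # constants that evaluate their next argument


def const(name):
    return ('const', name)


def num(n):
    return ('num', n)


def app(f, *args):
    """Left-nested application: app(f, a, b) builds ((f a) b)."""
    for a in args:
        f = ('app', f, a)
    return f


def subst(t, x, n):
    """[n/x]t, assuming n is closed (no capture can occur)."""
    if t[0] == 'var':
        return n if t[1] == x else t
    if t[0] == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, n))
    if t[0] == 'app':
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    return t  # numerals and constants


def step(t):
    """One reduction step, or None if no rule applies."""
    if t[0] != 'app':
        return None
    f, a = t[1], t[2]
    if f[0] == 'lam':                      # beta rule; argument is not evaluated
        return subst(f[2], f[1], a)
    if f == ('const', 'Y'):                # Y M -> M (Y M)
        return ('app', a, t)
    if f[0] == 'const' and f[1] in STRICT:
        a2 = step(a)
        if a2 is not None:                 # reduce the argument of a strict constant
            return ('app', f, a2)
        if a[0] == 'num':
            if f[1] == 'succ':
                return num(a[1] + 1)
            if f[1] == 'pred' and a[1] > 0:
                return num(a[1] - 1)
            if f[1] == 'iszero':
                return const('tt' if a[1] == 0 else 'ff')
        return None
    if f[0] == 'app' and f[1][0] == 'app' and f[1][1] == ('const', 'if'):
        b, m = f[1][2], f[2]               # t is ((if b) m) a
        if b == ('const', 'tt'):
            return m
        if b == ('const', 'ff'):
            return a
    f2 = step(f)                           # otherwise reduce in function position
    return None if f2 is None else ('app', f2, a)


def evaluate(t, fuel=10_000):
    """Iterate step until no rule applies; fuel guards against divergence."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("out of fuel (term may diverge)")


# Addition by recursion on the first argument, written with Y.
ADD = app(const('Y'), ('lam', 'f', ('lam', 'm', ('lam', 'n',
          app(const('if'), app(const('iszero'), ('var', 'm')),
              ('var', 'n'),
              app(const('succ'),
                  app(('var', 'f'), app(const('pred'), ('var', 'm')), ('var', 'n'))))))))
```

For instance, `evaluate(app(ADD, num(2), num(3)))` reduces to the numeral 5, and the argument `pred m` is passed unevaluated to the recursive call, as call-by-name requires.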

The standard denotational semantics for PCF is given by the semantic function

\[ \mathcal{D} : \mathit{Terms} \to \mathit{Environments} \to D_\sigma, \]

where $D_\sigma$ is a family of domains which includes the flat domains of booleans ($D_{bool}$) and integers
($D_{int}$), and such that $D_{\tau_1 \to \tau_2} = [D_{\tau_1} \to D_{\tau_2}]$, the continuous function space (see [70] for details).

Plotkin [70] showed that the standard denotational semantics $\mathcal{D}$ for PCF is not fully abstract.
The problem is that the denotational semantics is “finer” than the operational semantics. It makes
too many distinctions. If two programs are denotationally equivalent, then they are operationally
equivalent. The converse does not hold. This happens because the denotational semantics contains
functions which are not definable in the language, like parallel-or (por). Por returns true if at least
one of its arguments is true:

\[
\begin{array}{l}
por\ tt\ \bot = tt \\
por\ \bot\ tt = tt \\
por\ ff\ ff = ff
\end{array}
\]
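
The three equations can be mimicked in Python if the undefined element is modeled as an explicit token rather than as real divergence. This is only a sketch under that assumption: `BOTTOM` and `seq_or` are names introduced here, and a token can be inspected in a way that a genuinely non-terminating argument cannot, which is exactly why por is not definable sequentially.

```python
BOTTOM = None  # stands for the undefined element of the flat boolean domain


def por(a, b):
    """Parallel-or on the flat domain {BOTTOM, True, False}."""
    if a is True or b is True:
        return True            # one true argument suffices, whatever the other is
    if a is False and b is False:
        return False
    return BOTTOM              # not enough information yet


def seq_or(a, b):
    """Left-to-right sequential or: strict in its first argument."""
    if a is BOTTOM:
        return BOTTOM
    return True if a is True else b
```

The two functions agree on defined inputs but differ at `(BOTTOM, True)`: por answers true there, while the sequential version is stuck on its first argument.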

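Using por we can construct a program context that distinguishes denotationally between two
operationally equivalent programs. First, let $\Omega_\tau = Y_\tau(\lambda f.\, f)$. Then, let

\[
\begin{array}{l}
\mathit{ORTEST} = \lambda i.\, \lambda f.\ \supset_\iota (f\ tt\ \Omega_o) \\
\qquad\qquad (\supset_\iota (f\ \Omega_o\ tt) \\
\qquad\qquad\qquad (\supset_\iota (f\ ff\ ff)\ \Omega_\iota\ i) \\
\qquad\qquad\qquad \Omega_\iota) \\
\qquad\qquad \Omega_\iota
\end{array}
\]

Now, ORTEST 0 is operationally equivalent to ORTEST 1, namely they both diverge. However,
according to the denotational semantics,

\[ \mathcal{D}[\![\mathit{ORTEST}]\!]\bot = \lambda v.\, \lambda \phi.\ \text{if } \phi = por \text{ then } v \text{ else } \bot, \]

and we have $\mathcal{D}[\![\mathit{ORTEST}\ 0]\!] \neq \mathcal{D}[\![\mathit{ORTEST}\ 1]\!]$.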

Attempts  to  solve  this  problem  have  been  aimed  at  eliminating  the  unwanted  functions  from  the 
semantics,  by  restricting  the  continuous  functions  to  “sequential”  functions.  Other  ways  of  solving 
this  are  to  change  the  operational  semantics  or  to  add  primitives  to  PCF.  Plotkin  [70]  showed  that 
the  denotational  semantics  for  PCF  +  por  (referred  to  as  PCFP)  is  fully  abstract.  Cartwright, 
Curien,  and  Felleisen  showed  [14,  26]  that  full  abstraction  can  also  be  obtained  by  extending  PCF 
with  a  catch  primitive  (referred  to  as  PCFC). 

Recently,  the  full  abstraction  problem  for  PCF  has  been  solved,  independently,  by  several  groups 
[2, 52, 67]. Interestingly, all these solutions are constructed in a similar way: first, an intensional
semantics  based  on  game  semantics  is  constructed.  Then,  the  undesirable  intensional  elements 
are  filtered  out,  leaving  behind  an  extensional  model.  The  most  fascinating  part  is  that  the  game 
semantics  can  be  seen  as  an  elegant  generalization  of  the  work  discussed  in  the  next  section,  and 
upon  which  the  language  CDSO  is  based. 

2.2  Concrete  data  structures 

Some  of  the  early  efforts  at  defining  sequential  functions  were  hampered  by  working  in  a  setting 
where  no  distinction  was  made  between  function  domains  and  domains  of  the  data  they  compute 
on.  In  part  to  alleviate  this,  by  providing  a  model  in  which  it  is  easier  to  formalize  the  notion 
of  incremental  computation,  Kahn  and  Plotkin  [56]  developed  concrete  data  structures,  and  their 
domain-theoretic  version,  the  concrete  domains. 

A concrete data structure is like a variant record in Pascal. It consists of a set of named cells,
which can hold values, and an accessibility relation governing the order in which the cells can
be filled with values. A cell filled with a value is called an event, written $c = v$. The following
definitions are adapted from Curien [26].

Definition 2.2.1 A concrete data structure (cds) is a tuple $(C, V, E, \vdash)$, where $C$, $V$, $E$ are sets of
cells, values and events, respectively, such that

\[ E \subseteq C \times V \quad \text{and} \quad \forall c \in C,\ \exists v \in V.\ (c, v) \in E \]

and $\vdash$ is a relation, called an accessibility relation, between finite subsets of $E$ and elements of
$C$. We say that $\{e_1, \ldots, e_n\}$ is an enabling of $c$ if $\{e_1, \ldots, e_n\} \vdash c$, which may also be written
$e_1, \ldots, e_n \vdash c$. A cell $c$ such that $\emptyset \vdash c$ (which is abbreviated $\vdash c$) is called initial.

Definition 2.2.2 A state is a subset $x$ of $E$ such that:

1. $(c, v_1), (c, v_2) \in x \Rightarrow v_1 = v_2$

• (no cell is filled more than once). This is called consistency.

2. If $(c, v) \in x$ then there exists a sequence of events $e_1, \ldots, e_n$ such that $e_n = (c, v)$, $e_i =
(c_i, v_i) \in x$, and $\{e_j \mid j < i\}$ contains an enabling of $c_i$ for all $i \leq n$

• (only enabled cells may be filled). This is called safety.

The set of states of a cds $M$ ordered by set inclusion is a partial order $(D(M), \subseteq)$ called a
concrete domain. If a domain $D$ is isomorphic to $D(M)$, we say that $M$ generates $D$.

Definition  2.2.3  Given  a  state  x  of  a  cds,  we  say  that  a  cell  c  is: 

•  filled  (with  v)  in  x  if  (c,  v)  G  x, 

•  enabled  in  x  if  x  contains  an  enabling  of  c, 

•  accessible  from  x  if  it  is  enabled  but  not  filled  in  x. 

The sets of filled, enabled, and accessible cells of $x$ are denoted $F(x)$, $E(x)$, and $A(x)$, respectively.
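
The definitions of state, filled, enabled, and accessible cells can be checked mechanically on small finite examples. The Python sketch below is one possible encoding, not part of the original development: events are `(cell, value)` pairs, a state is a `frozenset` of events, and the accessibility relation is a list of `(cell, enabling)` pairs. The two-cell cds `N1`, `N2` used as the example is hypothetical.

```python
def filled(x):
    """F(x): cells that hold a value in x."""
    return {c for (c, _) in x}


def enabled(x, enablings):
    """E(x): cells with at least one enabling contained in x."""
    return {c for (c, pre) in enablings if pre <= x}


def accessible(x, enablings):
    """A(x): cells that are enabled but not filled in x."""
    return enabled(x, enablings) - filled(x)


def is_state(x, events, enablings):
    """Check consistency and safety for a finite set of events x."""
    if not x <= events:
        return False
    if len(filled(x)) != len(x):           # consistency: one value per cell
        return False
    reached = set()                        # safety: rebuild x one enabled event at a time
    changed = True
    while changed:
        changed = False
        for (c, v) in x - reached:
            if any(c2 == c and pre <= reached for (c2, pre) in enablings):
                reached.add((c, v))
                changed = True
    return reached == x


# Hypothetical example: N1 is initial, N2 is enabled once N1 holds 0 or 1.
EVENTS = frozenset({('N1', 0), ('N1', 1), ('N2', 0), ('N2', 1)})
ENABLINGS = [
    ('N1', frozenset()),
    ('N2', frozenset({('N1', 0)})),
    ('N2', frozenset({('N1', 1)})),
]
```

In this example `{N2 = 0}` alone violates safety, `{N1 = 0, N1 = 1}` violates consistency, and from the state `{N1 = 0}` the only accessible cell is `N2`.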

Definition 2.2.4 A state $y$ is said to cover a state $x$, written $x \prec y$, if $x < y$ and for all $z$ such that $x < z \leq y$
we have $z = y$. In addition, we write $x <_c y$ ($x \prec_c y$) if $c \in A(x)$, $c \in F(y)$ and $x < y$ ($x \prec y$).

Definition 2.2.5 A cds $M = (C, V, E, \vdash)$ is well-founded if the transitive closure of the relation
$\ll$ defined on $C$ by:

$c_1 \ll c$ if and only if an enabling of $c$ contains an event $(c_1, v)$

is well-founded, i.e., there is no infinite descending sequence $\ldots\ c_{n+1} \ll c_n \ll \ldots \ll c$.

Definition  2.2.6  A  cds  is  called  stable  if  for  each  state  x  and  cell  c  enabled  in  x,  c  has  a  unique 
enabling  in  x. 

We are only interested in well-founded and stable cds in the sequel. We call such cds
deterministic (dcds). In addition, we shall be using an operational semantics for CDSO called CDS02 [7, 30],
which requires the dcds to be sequential.
Definition 2.2.7 A cds $M$ is called sequential if, for every cell $d$, and each state $x$ of $M$ such that

\[ d \notin F(x) \ \text{ and } \ \exists y \geq x.\ d \in F(y), \]

there exists a cell $c$ such that:

\[ c \in A(x) \ \text{ and } \ \forall y \geq x,\ d \in F(y) \Rightarrow c \in F(y). \]

Such a cell $c$ is called a sequentiality index of $M$ for $d$ at $x$.

A  cds  M  such  that  all  its  enablings  contain  at  most  one  event  is  called  filiform.  Filiform  cds’s 
are  particularly  well-behaved,  and  they  make  the  presentation  of  some  definitions  much  simpler. 

Example 2.2.8 We can define the dcds of booleans (BOOL) the following way: there is one cell
called $B$, which can be filled with either $tt$ or $ff$. The set of states of this dcds is:

\[ \{\{\}, \{B = tt\}, \{B = ff\}\}. \]

Note that $(D(\mathrm{BOOL}), \subseteq)$ is isomorphic to $D_{bool}$, the flat domain of booleans, i.e., BOOL generates
$D_{bool}$.

Example 2.2.9 The dcds of integers, INT, can be defined in a similar fashion: there is one cell
called $N$ which can be filled with any integer value. Again, note that INT generates $D_{int}$, the flat
domain of integers.

Example 2.2.10 We provide an example of a nonsequential dcds, which will become relevant
when we present the operational semantics CDS02. The dcds is called STABLE and has four cells:
$B_1$, $B_2$, $B_3$, and $C$.

The cells $B_i$ are all initial with possible values $tt$, $ff$. Cell $C$ has any integer as a
possible value, and the following access conditions:

\[
\begin{array}{l}
\{B_1 = tt, B_2 = ff\} \vdash C \\
\{B_2 = tt, B_3 = ff\} \vdash C \\
\{B_3 = tt, B_1 = ff\} \vdash C.
\end{array}
\]

The reason STABLE fails to be sequential is that we cannot determine sequentially if $C$ is accessible;
each  of  the  access  conditions  omits  one  of  the  Bi  cells,  so  we  wouldn’t  know  where  to  start  to  figure 
out  if  C  is  accessible.  It  lacks  a  sequentiality  index. 
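
The failure of sequentiality can be confirmed by brute force. The sketch below, in Python, enumerates the states of a finite version of STABLE and computes the sequentiality indices for the cell C at a given state. It rests on one simplifying assumption: C is given the single value 0 instead of all integers, which does not affect which cells are filled. All function names are introduced for this illustration.

```python
from itertools import combinations

BS = ['B1', 'B2', 'B3']
EVENTS = [(b, v) for b in BS for v in ('tt', 'ff')] + [('C', 0)]
ENABLINGS = {
    'B1': [frozenset()], 'B2': [frozenset()], 'B3': [frozenset()],
    'C': [frozenset({('B1', 'tt'), ('B2', 'ff')}),
          frozenset({('B2', 'tt'), ('B3', 'ff')}),
          frozenset({('B3', 'tt'), ('B1', 'ff')})],
}


def filled(x):
    return {c for (c, _) in x}


def accessible(x):
    enabled = {c for c, pres in ENABLINGS.items() if any(p <= x for p in pres)}
    return enabled - filled(x)


def is_state(x):
    if len(filled(x)) != len(x):           # consistency
        return False
    reached, changed = set(), True         # safety, by fixpoint iteration
    while changed:
        changed = False
        for e in x - reached:
            if any(p <= reached for p in ENABLINGS[e[0]]):
                reached.add(e)
                changed = True
    return reached == x


STATES = [frozenset(s)
          for r in range(len(EVENTS) + 1)
          for s in combinations(EVENTS, r)
          if is_state(frozenset(s))]


def sequentiality_indices(x, d):
    """Cells accessible from x that are filled in every state above x filling d."""
    above = [y for y in STATES if x <= y and d in filled(y)]
    return {c for c in accessible(x) if all(c in filled(y) for y in above)}
```

At the empty state no cell qualifies, since each of the three enablings of C leaves one of the B cells unfilled. By contrast, once `B1 = tt` is known, `B2` becomes a sequentiality index, because both remaining ways of enabling C need it.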

We  can  view  a  product  of  two  cds’s  as  being  composed  of  two  sides:  a  left  and  a  right  hand 
side.  The  product  is  created  by  tagging  the  left  hand  side  cds  cells  with  a  1  and  the  right  hand 
side  cells  with  a  2. 

Definition 2.2.11 Let $M$ and $M'$ be two cds’s. The product $M \times M' = (C, V, E, \vdash)$ is defined as
follows:

• $C = \{c.1 \mid c \in C_M\} \cup \{c'.2 \mid c' \in C_{M'}\}$,

• $V = V_M \cup V_{M'}$,

• $E = \{(c.1, v) \mid (c, v) \in E_M\} \cup \{(c'.2, v') \mid (c', v') \in E_{M'}\}$,

• $(c_1.1, v_1), \ldots, (c_n.1, v_n) \vdash c.1$ if $(c_1, v_1), \ldots, (c_n, v_n) \vdash c$ (and similarly for the right hand side).
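
The product construction amounts to tagging and taking unions, as the following Python sketch shows. The dictionary representation of a cds (keys `C`, `V`, `E`, `enablings`) and the encoding of a tagged cell `c.1` as the pair `(c, 1)` are assumptions of this sketch.

```python
def tag_cds(m, i):
    """Copy of cds m with every cell c renamed to (c, i)."""
    def t(c):
        return (c, i)
    return {
        'C': {t(c) for c in m['C']},
        'V': set(m['V']),
        'E': {(t(c), v) for (c, v) in m['E']},
        'enablings': [(frozenset((t(c), v) for (c, v) in pre), t(c0))
                      for (pre, c0) in m['enablings']],
    }


def product(m1, m2):
    """Product cds: left cells tagged with 1, right cells tagged with 2."""
    a, b = tag_cds(m1, 1), tag_cds(m2, 2)
    return {
        'C': a['C'] | b['C'],
        'V': a['V'] | b['V'],
        'E': a['E'] | b['E'],
        'enablings': a['enablings'] + b['enablings'],
    }


# BOOL has one initial cell B with values tt and ff.
BOOL = {
    'C': {'B'},
    'V': {'tt', 'ff'},
    'E': {('B', 'tt'), ('B', 'ff')},
    'enablings': [(frozenset(), 'B')],
}
BOOL2 = product(BOOL, BOOL)
```

Here `BOOL2` has the two initial cells `('B', 1)` and `('B', 2)` and four events, so its states are the pairs of (possibly empty) states of BOOL.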

It is easier to visualize the product of two cds’s when they are both filiform. Table 2.1
summarizes the procedure.
             $M$                     $M'$                      $M \times M'$
Cells        $c$                     $c'$                      $c.1$, $c'.2$
Values       $v$                     $v'$                      $v$, $v'$
Events       $(c, v)$                $(c', v')$                $(c.1, v)$, $(c'.2, v')$
Enablings    $(c_1, v_1) \vdash c$   $(c_1', v_1') \vdash c'$  $(c_1.1, v_1) \vdash c.1$, $(c_1'.2, v_1') \vdash c'.2$

Table 2.1: Product of two filiform dcds’s

2.2.1  Sequential  functions 

Using  cds’s,  Kahn  and  Plotkin  [56]  defined  a  notion  of  sequential  function. 

Definition 2.2.12 A continuous function $f$ is sequential at some state $x$ in its domain if, for each
cell $d$ accessible in $f(x)$, either:

1. no cell is accessible in $x$, or

2. there is an accessible cell $c$ that must be filled in any state $y$ that is a superset of $x$ such that
$d$ is filled in $f(y)$. The cell $c$ is called a sequentiality index of $f$ at $x$ for $d$.

A function is sequential if it is continuous and sequential at every $x$ in its domain.

Intuitively, this definition captures the notion that a sequential function is at any point
dependent on one of its inputs; if that input diverges, the function will diverge.

2.2.2  Sequential  algorithms 

Berry  and  Curien  [6]  showed  that  Kahn-Plotkin  cds’s  and  sequential  functions  do  not  form  a 
cartesian  closed  category  (ccc),  hence  they  cannot  be  used  to  model  PCF.  However,  Berry  and 
Curien  defined  sequential  algorithms  on  cds’s,  which  do  form  a  ccc.  This  model  was  not  useful  for 
solving  full  abstraction  for  PCF  because  it  is  intensional  (and  it  is  known  that  the  solution  must 
be  extensional  [63]),  but  that  is  exactly  the  feature  of  interest  for  this  work;  the  meaning  of  a  PCF 
term  is  an  algorithm  and  the  model  is  fully  abstract  with  respect  to  a  notion  of  observability  that  is 
sensitive  to  computation  strategy.  This  is  the  first  instance  of  an  intensional  model  in  the  computer 
science  literature  of  which  we  are  aware. 

Sequential algorithms can be viewed two ways: abstractly and concretely. Abstractly, a
sequential algorithm is a pair of a sequential function and a (sequential) computation strategy. If there
are several ways of proceeding during the computation, the computation strategy points out a
particular one. Concretely, a sequential algorithm is a state of a dcds of arrow type (the exponentiation
dcds).

Definition 2.2.13 Given two dcds’s, $M$ and $M'$, the exponentiation dcds $M \Rightarrow M'$ is defined [26]
by:

• if $x$ is a finite state of $M$ and if $c'$ is a cell of $M'$, then $xc'$ is a cell of $M \Rightarrow M'$.

• the values and events are of two types:

1. type “valof”: if $c$ is a cell of $M$ then $\mathit{valof}\ c$ is a value of $M \Rightarrow M'$, and $(xc', \mathit{valof}\ c)$
is an event of $M \Rightarrow M'$ if $c$ is accessible from $x$,
             $M$                     $M'$                      $M \Rightarrow M'$
Cells        $c$                     $c'$                      $xc'$, where $x \in D(M)$
Values       $v$                     $v'$                      $\mathit{valof}\ c$, $\mathit{output}\ v'$
Events       $(c, v)$                $(c', v')$                $(xc', \mathit{valof}\ c)$, where $c \in A(x)$,
                                                               $(xc', \mathit{output}\ v')$
Enablings    $(c_1, v_1) \vdash c$   $(c_1', v_1') \vdash c'$  $(yc', \mathit{valof}\ c) \vdash xc'$, if $y \prec_c x$
                                                               $(xc_1', \mathit{output}\ v_1') \vdash xc'$

Table 2.2: Exponentiation of two filiform dcds’s



2. type “output”: if $v'$ is a value of $M'$ then $\mathit{output}\ v'$ is a value of $M \Rightarrow M'$, and
$(xc', \mathit{output}\ v')$ is an event of $M \Rightarrow M'$ if $(c', v')$ is an event of $M'$.

• the enablings are also of two types:

1. $(yc', \mathit{valof}\ c) \vdash xc'$ if $y \prec_c x$ (type “valof”),

2. $(x_1c_1', \mathit{output}\ v_1'), \ldots, (x_nc_n', \mathit{output}\ v_n') \vdash xc'$

if $x = \bigcup\{x_i \mid i \leq n\}$ and $(c_1', v_1'), \ldots, (c_n', v_n') \vdash c'$ (type “output”).

A state of $M \Rightarrow M'$ is called a sequential algorithm.

Again, for ease of reading we provide a description of the procedure for constructing an
exponentiation dcds for filiform dcds in Table 2.2.

Example 2.2.14 The state of BOOL $\Rightarrow$ BOOL that corresponds to the boolean negation is:

\[ \{\ \{\}B = \mathit{valof}\ B,\ \ \{B = tt\}B = \mathit{output}\ ff,\ \ \{B = ff\}B = \mathit{output}\ tt\ \}. \]

The  way  to  read  this  definition  is:  Given  no  information  about  the  input  and  having  to  fill  the 
output  cell  B ,  we  ask  what  value  the  input  cell  B  holds.  If  the  input  is  true  we  output  false  and 
conversely. 
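
This reading can be turned into a small interpreter. The Python sketch below represents a sequential algorithm as a dictionary from cells $xc'$ of the exponentiation (a pair of an input state and an output cell) to a `valof` or `output` instruction, and runs it against a partial input. The representation and the names `NOT` and `run` are assumptions of this sketch; a missing input cell is reported as `None`, standing in for a computation that would wait forever.

```python
# Boolean negation as a state of BOOL => BOOL.
NOT = {
    (frozenset(), 'B'): ('valof', 'B'),
    (frozenset({('B', 'tt')}), 'B'): ('output', 'ff'),
    (frozenset({('B', 'ff')}), 'B'): ('output', 'tt'),
}


def run(alg, inp, out_cell):
    """Compute the value of out_cell, consulting the input only as instructed.

    alg maps (known input state, output cell) to ('valof', c) or ('output', v);
    inp maps the input cells that are filled to their values.
    """
    x = frozenset()                        # what has been learned about the input
    while (x, out_cell) in alg:
        kind, arg = alg[(x, out_cell)]
        if kind == 'output':
            return arg
        if arg not in inp or arg in {c for (c, _) in x}:
            return None                    # input cell unfilled, or ill-formed request
        x = x | {(arg, inp[arg])}          # record the answer and ask again
    return None                            # the algorithm has no event for this cell
```

Running `NOT` on an input whose cell `B` holds `tt` first hits the `valof B` event, extends the known input state to `{B = tt}`, and then finds the `output ff` event.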

Berry  and  Curien  [6]  defined  application,  composition,  product,  pairing,  currying,  uncurrying, 
and  fixpoint  for  sequential  algorithms,  and  showed  that  sequential  algorithms  and  cds’s  form  a 
ccc. They used sequential algorithms to construct an (intensional) model of typed λ-calculus
with recursion; λ-expressions are translated to categorical combinators, which are represented by
sequential  algorithms. 


2.3  The  language  CDSO 

The programming language CDSO [5, 6, 7, 30] is a direct implementation of the intensional
denotational semantics presented above; hence, it is an intensional programming language of sequential
algorithms. The name stands for Concrete Data Structures. The initial idea [5] was to make CDSO
a kind of “assembly” language for a (syntax-wise) ML-like language called CDS. Programs in CDS
would be compiled down to CDSO. Only CDSO was ever implemented [30].

CDSO  is  a  lazy,  polymorphic,  higher-order,  functional  language  with  several  quite  interesting 
features: 
• Uniformity of types. Everything in CDSO is a state of a dcds. This can be a state-constant
or  a  higher-order  algorithm.  The  algorithm  syntax  is  just  syntactic  sugar  for  the  state  of  a 
dcds.  Consequently,  an  algorithm  can  be  evaluated  without  being  applied  to  any  argument. 
Operationally  speaking,  terms  of  non-ground  type  can  be  observed. 

•  Full  abstraction.  The  denotational  semantics  of  CDSO,  which  maps  an  algorithm  to  a  state 
of  the  dcds  corresponding  to  its  type  (hence  a  CDSO  object)  is  fully  abstract  with  respect 
to  two  different  operational  semantics  (CDS01  and  CDS02)  [26].  Since  the  semantics  of  an 
algorithm  is  a  CDSO  object  it  is  possible  to  write  algorithms  which  manipulate  the  semantics 
of  other  algorithms. 

•  Demand-driven,  coroutine-like  evaluation  style.  The  user  types  in  an  expression  and  enters  a 
request  loop,  where  questions  about  various  cells  of  the  dcds  that  is  the  type  of  the  expression 
can  be  asked.  If  the  expression  is  a  state-constant  the  value  of  the  cell  is  simply  looked  up 
in  the  state;  if  the  expression  is  compound,  processes  are  associated  with  each  subexpression 
and  they  exchange  information  while  computing  the  value  of  the  cell.  The  computation  style 
is  an  extension  of  the  coroutine  mechanism  of  Kahn  and  MacQueen  [55]. 

•  Rich  data  structure  definition  facilities.  Cds’s  are  very  general  and  permit  definition  of  a  wide 
variety  of  data  structures.  In  particular,  they  can  be  defined  recursively,  and  one  cds  can  be 
grafted  into  another.  A  thorough  discussion  of  the  type  definitions  in  CDSO  will  be  deferred 
to  the  second  part  of  this  thesis,  where  we  develop  a  type  inference  system  for  CDSO. 

The  examples  in  this  section  are  from  our  own  implementation  of  CDSO,  which  follows  that  of 
Devin  [30]  for  the  untyped  part.  We  shall  describe  our  implementation  in  more  detail  in  Chapter  7. 


2.3.1  Type  definitions 

The  types  in  CDSO  are  dcds’s.  Only  ground  dcds’s  can  be  defined;  higher-order  ones  must  be 
created  out  of  pre-existing  dcds’s.  We  begin  by  defining  the  dcds’s  we  have  already  encountered  in 
the  previous  section: 

let  bool  =  dcds 

cell  B  values  tt,  ff 
end; 

let  int  =  dcds 

cell  N  values  [..] 
end; 


let stable = dcds
  cell B1 values tt, ff
  cell B2 values tt, ff
  cell B3 values tt, ff
  cell C values [..]
    access B1=tt,B2=ff or B2=tt,B3=ff or B3=tt,B1=ff
end;
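To make the accessibility condition on cell C concrete, here is a minimal Python sketch (not CDSO code). It assumes a hypothetical encoding in which an initial cell carries no enablings and every other cell carries a list of alternative enablings; the names STABLE_ACCESS and enabled are illustrative only.

```python
# Hypothetical encoding of STABLE's accessibility relation:
# None marks an initial cell; otherwise a list of alternative enablings.
STABLE_ACCESS = {
    "B1": None,
    "B2": None,
    "B3": None,
    "C": [
        {"B1": "tt", "B2": "ff"},
        {"B2": "tt", "B3": "ff"},
        {"B3": "tt", "B1": "ff"},
    ],
}

def enabled(access, state, cell):
    """A cell is enabled if it is initial or some enabling is contained in the state."""
    enablings = access[cell]
    if enablings is None:
        return True
    return any(
        all(state.get(c) == v for c, v in enabling.items())
        for enabling in enablings
    )
```

With this encoding, C is enabled in the state {B2=tt, B3=ff} but not in {B1=tt} alone.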


The  graft  construct  allows  us  to  copy  an  already  defined  dcds  into  another,  tagging  all  its  cells 
with  a  specified  tag  and  optionally  adding  accessibility  conditions.  Here  is  a  simple  example: 
let tagged_bool = dcds
graft  (bool.foo) 
end; 

This  declaration  creates  a  dcds  with  a  single  cell,  B.foo  with  possible  values  tt,ff.  Grafting 
is  more  useful  when  used  in  conjunction  with  recursive  declarations.  We  could  define  a  stream  of 
integers  in  the  following  fashion: 

letrec int_stream = dcds
  cell (N.1) values [..]
  graft (int_stream.1) access (N.1) = [..]
end;

We have created an infinite dcds with cells of the form (N.1), ((N.1).1), ..., each of which can have
any  integer  as  a  value,  and  such  that  one  cell  has  to  be  filled  in  order  for  the  next  one  to  become 
enabled.  We  can  explore  in  our  interpreter  the  structure  of  this  dcds,  by  asking  it  to  unroll  the 
dcds: 

#  show  more  3  int_stream; 

{
(N.1) values [..],
((N.1).1) values [..] access (N.1)=[..],
(((N.1).1).1) values [..] access ((N.1).1)=[..]}

Cell  names  that  become  ever-longer  as  a  dcds  is  being  unrolled  are  a  typical  feature  of  CDSO 
recursive  type  declarations. 
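The growth of cell names under unrolling can be sketched in a few lines of Python. This is only an illustration of the naming scheme, assuming the cell is called (N.1) and the graft tag is 1; the function unroll_int_stream is hypothetical and is not part of the interpreter.

```python
def unroll_int_stream(depth):
    """Return (cell name, enabling cell) pairs for the first `depth` cells.

    Each graft wraps the previous cell name with the tag 1, and each cell
    is enabled by the one before it (None marks the initial cell).
    """
    cells = []
    name = "(N.1)"
    previous = None
    for _ in range(depth):
        cells.append((name, previous))
        previous = name
        name = f"({name}.1)"
    return cells
```

Calling unroll_int_stream(3) yields the three cell names displayed by the interpreter above, each paired with the cell that enables it.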

2.3.2  Interaction  with  the  interpreter 

The  states  we  encountered  in  the  previous  section  can  be  typed  “as  is”  into  the  interpreter.  We 
begin  with  a  state  of  BOOL.  We  omit  typing  considerations. 

#  {B  =  tt}; 
request?  B; 

— >  tt 
request?  ; 

# 

Note  how  we  entered  the  request  loop,  and  examined  the  contents  of  cell  B.  Unsurprisingly,  it  was 
filled with tt. We can perform the same kind of examination of a higher-order state, i.e., we can
explore  an  algorithm  without  applying  it  to  an  argument.  Here  is  boolean  negation  again: 

# { {}B = valof B, {B=tt}B = output ff, {B=ff}B = output tt };
request? {}B;

— >  valof  B 

Already, we have some idea of the computation strategy of this algorithm: we know it examines its
input.  By  continuing  the  questions  and  answers  further,  we  can  find  out  what  it  does  with  its  two 
possible  inputs: 
request? {B=tt}B;

— >  output  ff 
request?  {B=ff}B; 

— >  output  tt 

2.3.3  Algorithm  syntax 

Writing  algorithms  solely  in  state  form  would  quickly  become  tedious,  so  an  alternative  syntax  is 
provided.  It  is  important  to  realize,  however,  that  the  notation  is  just  syntactic  sugar  for  a  state. 
An algorithm from M to M' will have the general form:

algo
  request c1' do
    <instruction>
  end

  request c2' do
    <instruction>
  end
end

The algorithm contains a number of request-do branches, each of which specifies a recipe for
computing the value of an output cell ci'. There are two kinds of instructions:

output v'

which outputs a value into a cell of M' and

valof c is
  v1 : <instruction>
  vn : <instruction>
end

which  tests  a  certain  input  cell  and  branches  accordingly. 

If an output cell c' is not initial we must specify, after the request-do construct, how it can
become enabled. This is done with a from-do construct:

from <input state 1> do
<instruction> 
end 

from  <input  state  2>  do 
<instruction> 
end 


Two  simple  algorithms  that  work  on  booleans  are  shown  in  Figure  2.3.  The  first  is  the  boolean 
negation,  which  we  have  encountered  already  in  state  form.  The  two  forms  are  equivalent  (they 
actually  map  into  the  same  internal  representation,  as  we  shall  see).  The  second  is  “left  and,” 
which  performs  a  boolean  conjunction,  testing  its  left  input  first;  if  that  input  is  ff  it  outputs  the 
value ff right away, so it is not strict in both arguments. We can actually write four different
kinds of boolean conjunction algorithms: “left and,” “left strict and” (which would check its right
argument even if the left one is ff), “right and,” and “right strict and.”

let not =
algo
  request B do
    valof B is
      tt : output ff
      ff : output tt
    end
  end
end;

let land =
algo
  request B do
    valof (B.1) is
      tt : valof (B.2) is
             tt : output tt
             ff : output ff
           end
      ff : output ff
    end
  end
end;

Figure 2.3: Boolean negation and left conjunction algorithms
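The non-strictness of “left and” can be illustrated with a small Python model of the two algorithms as decision trees. This is a sketch under assumed encodings (tuples for instructions, a dict for the input state); the names NOT, LAND and run are illustrative and do not correspond to anything in the CDSO implementation.

```python
# An instruction is ("output", value) or ("valof", cell, branches).
NOT = ("valof", "B", {"tt": ("output", "ff"), "ff": ("output", "tt")})

LAND = ("valof", "B.1", {
    "tt": ("valof", "B.2", {"tt": ("output", "tt"), "ff": ("output", "ff")}),
    "ff": ("output", "ff"),
})

def run(instr, state):
    """Follow valof tests against a partial input state.

    Returns ("output", v) once a leaf is reached, or ("valof", cell)
    when the next cell to be tested is not filled in the state.
    """
    while instr[0] == "valof":
        _, cell, branches = instr
        if cell not in state:
            return ("valof", cell)
        instr = branches[state[cell]]
    return instr
```

For instance, run(LAND, {"B.1": "ff"}) produces an output without ever consulting B.2, whereas a “left strict and” would add a test of B.2 under the ff branch as well.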

An  example  of  an  algorithm  which  uses  the  from-do  construct  is  shown  in  Figure  2.4.  The 
algorithm acts like the identity on the initial cells of STABLE, but then distinguishes how cell C
became  enabled. 

2.3.4 Polymorphism

Polymorphism  arises  in  CDSO  through  the  use  of  generic  (i.e.,  variable)  cell  and  value  names. 
Variable  names  start  with  the  special  symbol  “$”.  For  example,  this  is  how  we  could  write  the 
polymorphic  identity: 

let  id  =  algo 

request  $C  do 
valof  $C  is 

$V  :  output  $V 

end 

end 

end; 

Note that the $C from the output name is the same as the input one, and similarly for the
values. The output cell name gets bound to a non-variable name first, when a query is issued.
When the answer is returned, the input value gets bound to a non-variable first, then is copied to
the  output  value.  To  make  this  clearer,  we  evaluate  id  by  itself: 

#  id; 

request?  {}B; 

— >  valof  B 
request?  {B=tt}B; 

— >  output  tt 
let distinguish =
algo 

request  B1  do 
valof  B1  is 
tt  :  output  tt 
ff  :  output  ff 
end 
end 

request  B2  do 
valof  B2  is 
tt  :  output  tt 
ff  :  output  ff 
end 
end 

request  B3  do 
valof  B3  is 
tt  :  output  tt 
ff  :  output  ff 
end 
end 

request  C  do 

from {B1=tt, B2=ff} do
output 1
end

from {B2=tt, B3=ff} do
output 2
end

from {B3=tt, B1=ff} do
output  3 
end 
end 
end; 


Figure  2.4:  Algorithm  utilizing  from-do  construct 
Combinator    Syntax       Argument Type                Result Type
Application   A.B          A: D → D', B: D              A.B: D'
Composition   A|B          A: D' → D'', B: D → D'       A|B: D → D''
Fixpoint      fix(A)       A: D → D                     fix(A): D
Curry         curry(A)     A: D × D' → D''              curry(A): D → D' → D''
Uncurry       uncurry(A)   A: D → D' → D''              uncurry(A): D × D' → D''
Pair          <A, B>       A: D → D1, B: D → D2         <A, B>: D → D1 × D2
Product       (A, B)       A: D1, B: D2                 (A, B): D1 × D2

Table 2.3: CDSO combinators

Since id has type ∀α. α → α, its cells will be higher-order. First we asked what the output cell
B  is,  given  no  information  about  the  input,  and  id  replied  that  it  needs  to  know  the  value  of  the 
eponymous  input  cell.  When  given  that  value,  it  simply  copied  it  over  to  the  output. 

2.3.5  Categorical  combinators 

Sequential algorithms and states can be combined to form expressions using the categorical
combinators. There are seven combinators: application (“.”), composition (“|”), fixpoint (“fix”), curry
(“curry”), uncurry (“uncurry”), pair (“< , >”), and product (“( , )”). For ease of reference, we
list the combinators, along with their types, in Table 2.3. The language of expressions is given by
the following grammar, where x stands for a state-constant, and a for an algorithm declaration:

e ::= x | a | e.e | e|e | fix(e) | curry(e) | uncurry(e) | <e, e> | (e, e).

As  an  example  of  the  use  of  combinators,  we  examine  two  simple  expressions  in  the  interpreter: 

#  not.{B=tt}; 
request?  B; 

— >  ff 
request?  ; 

# not | land;
request? {}B;

— > valof (B.1)
request? {(B.1)=ff}B;

— > output tt

2.3.6  Forest  representation 

Before  presenting  the  operational  semantics,  we  introduce  the  internal  representation  of  forests  for 
sequential  algorithms  developed  by  Devin  [30].  Sequential  algorithms  already  have  a  tree  structure, 
so  forests  are  quite  similar.  The  two  differences  are: 

1.  The  lists  of  from-do  instructions  become  a  tree  of  From  instructions,  similar  to  valofs.  This 
is  possible  when  we  restrict  ourselves  to  sequential  dcds’s. 

2. When asking the value of an input cell, we specify which input it comes from, i.e., given a
type M_1 → M_2 → ⋯ → M_n, when we perform a valof of cell c_i from M_i, this becomes a Valof
(c_i, i) instruction. This has the effect of making currying and uncurrying a simple game on
input cell indices and tags.

let curry_land =
algo
  request {}B do
    valof B is
      tt : output valof B
      ff : output output ff
    end
  end

  request {B=tt}B do
    from {B=tt} do
      output output tt
    end
  end

  request {B=ff}B do
    from {B=tt} do
      output output ff
    end
  end
end;

# print land;
land =
{B=valof ((B.1), 1) is
     tt : valof ((B.2), 1) is
            tt : output tt
            ff : output ff
     ff : output ff
}

# print curry_land;
curry_land =
{B=valof (B, 1) is
     tt : valof (B, 2) is
            tt : output output tt
            ff : output output ff
     ff : output output ff
}

Figure 2.5: Curried left conjunction and internal representations

A forest contains several trees, each one containing an instruction specifying how to compute
one output cell: Tree (c_1, instruction_1), ..., Tree (c_n, instruction_n). The instructions are of three
kinds:

Valof (c, i) is             From (c, i) is              Result output^n v
  v_1 : instruction_1         v_1 : instruction_1
  ...                         ...
  v_n : instruction_n         v_n : instruction_n
end                         end

Figure  2.5  shows  a  sequential  algorithm  for  the  curried  version  of  “left  and”  and  the  internal 
representation  as  a  forest  for  this  algorithm  and  the  earlier  presented  “left  and.”  Note  that  despite 
their  very  different  syntax  trees,  the  two  algorithms  map  into  very  similar  forests,  with  the  only 
differences  occurring  in  the  number  of  inputs  and  the  product  tags  on  the  input  cells. 

We are now ready to present the operational semantics for the evaluation of forests. (All the
CDSO operational semantics rules are summarized in Appendix A.1 for ease of reference.) The
rules are of the form forest ? c → v. We introduce a special value, Ω, which stands for an unfilled
cell. This is different from ⊥: when attempting to find a cell’s value, the interpreter may loop, in
which case the value would be ⊥. It is possible, however, that the interpreter can figure out that
the cell is not filled in that state, in which case the result would be Ω. The first two rules specify
the search of a forest for the proper tree.


\[
\text{(Tree1)}\qquad
\mathrm{Tree}(c'_1, ins_1), \ldots, \mathrm{Tree}(c'_n, ins_n) \;?\; x_1 \cdots x_n c'_i
\;\to\; ins_i \;?\; x_1 \cdots x_n c'_i
\]

\[
\text{(Tree2)}\qquad
\frac{\forall i.\; c'_i \neq c'}
     {\mathrm{Tree}(c'_1, ins_1), \ldots, \mathrm{Tree}(c'_n, ins_n) \;?\; x_1 \cdots x_n c' \;\to\; \Omega}
\]

When executing a Result instruction, we simply output the resultant value.

\[
\text{(Result)}\qquad
\mathrm{Result}\ v' \;?\; x_1 \cdots x_n c' \;\to\; v'
\]


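A Valof (c_i, i) of a cell x_1 ⋯ x_n c' specifies that we need to ask the i’th state part of the cell
name x_1 ⋯ x_n c' what value c_i has. If that sub-query returns a value we look up the appropriate
branch of the Valof. If no branch matches we return Ω, and if the sub-query returns Ω we return
an answer that specifies that we still need the value of cell c_i.

\[
\text{(Valof)}\qquad
\frac{x_p \;?\; c \;\to\;
      \begin{cases}
        v_i \\
        v, \text{ and } \forall i.\ v_i \neq v \\
        \Omega
      \end{cases}}
     {\begin{array}{l}
        \mathrm{Valof}\ (c,p)\ \mathrm{is} \\
        \quad v_1 : ins_1 \\
        \quad \vdots \\
        \mathrm{end}
      \end{array}
      \;?\; x_1 \cdots x_n c' \;\to\;
      \begin{cases}
        ins_i \;?\; x_1 \cdots x_n c' \\
        \Omega \\
        \mathrm{output}^{p-1}\ \mathrm{valof}\ c
      \end{cases}}
\]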


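From instructions are very similar to Valof. When the sub-query returns a value not in the list
we fail by raising an exception, because the From cannot be satisfied. If the sub-query returns Ω,
we also return Ω without failing, because it might still be possible to increase the input state and
potentially satisfy the From.

\[
\text{(From)}\qquad
\frac{x_p \;?\; c \;\to\;
      \begin{cases}
        v_i \\
        v, \text{ and } \forall i.\ v_i \neq v \\
        \Omega
      \end{cases}}
     {\begin{array}{l}
        \mathrm{From}\ (c,p)\ \mathrm{is} \\
        \quad v_1 : ins_1 \\
        \quad \vdots \\
        \mathrm{end}
      \end{array}
      \;?\; x_1 \cdots x_n c' \;\to\;
      \begin{cases}
        ins_i \;?\; x_1 \cdots x_n c' \\
        \text{fail with no-access} \\
        \Omega
      \end{cases}}
\]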
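The Tree, Result, Valof and From rules can be read as a small recursive evaluator. The Python sketch below is an illustration only, under simplifying assumptions: input states are given as plain dicts (so the sub-query x_p ? c becomes a dictionary lookup), instructions are tuples, and the names OMEGA, NoAccess, eval_instr and eval_forest are hypothetical.

```python
OMEGA = "Ω"  # "cell not filled"; distinct from divergence

class NoAccess(Exception):
    """Raised when a From instruction sees a value outside its branches."""

def eval_instr(instr, inputs):
    """Evaluate one instruction against a list of input states (dicts)."""
    kind = instr[0]
    if kind == "Result":
        return instr[1]
    _, cell, p, branches = instr
    value = inputs[p - 1].get(cell, OMEGA)  # sub-query on the p-th input
    if value == OMEGA:
        if kind == "Valof":
            # We still need cell c of input p: output^(p-1) valof c.
            return ("output",) * (p - 1) + ("valof", cell)
        return OMEGA  # From: the input state might still grow
    if value not in branches:
        if kind == "From":
            raise NoAccess(cell)
        return OMEGA
    return eval_instr(branches[value], inputs)

def eval_forest(forest, inputs, cell):
    """Search the forest for the tree of `cell`; OMEGA if there is none."""
    if cell not in forest:
        return OMEGA
    return eval_instr(forest[cell], inputs)

# The forest printed for curry_land in Figure 2.5, in this encoding.
CURRY_LAND = {
    "B": ("Valof", "B", 1, {
        "tt": ("Valof", "B", 2, {
            "tt": ("Result", "output output tt"),
            "ff": ("Result", "output output ff"),
        }),
        "ff": ("Result", "output output ff"),
    }),
}
```

In this encoding, eval_forest(CURRY_LAND, [{"B": "tt"}, {}], "B") asks for cell B of the second input, mirroring the “output valof B” answer of the curried algorithm.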


2.3.7  CDS02  operational  semantics 

The  first  operational  semantics  devised  for  CDSO  was  called  CDS01  [26],  and  it  involved  the  use  of 
tables  to  store  temporary  results  when  evaluating  certain  constructs.  The  reason  for  the  tables  is 
that  in  some  cases  it  is  impossible  to  re-derive  the  enablings  that  allowed  us  to  reach  a  certain  point. 
Consider applying the algorithm distinguish from Figure 2.4, of type STABLE → STABLE, to a
state of STABLE, let us call it arg, in which attempting to evaluate B1 loops, but B2 = tt, B3 = ff.
If we have already evaluated B2, B3, then distinguish.arg ? C should return 2. But if we did not
store the previously evaluated B2, B3, we would have no way of knowing how to go about finding
C’s value, and we would loop if we chose to evaluate B1.

The  tables  in  CDS01  were  only  necessary  for  application,  composition,  and  fixpoint.  Carrying 
previously  evaluated  values  around  led  to  efficient  evaluation  of  fixpoints,  but  in  general  CDS01 
was  inefficient  in  both  space  usage  and  time,  because  of  searches  in  the  tables. 

In  CDS02  [30]  the  restriction  is  made  that  all  dcds’s  be  sequential,  so  STABLE  is  outlawed  and 
one  no  longer  has  to  keep  the  tables.  The  restriction  to  sequential  dcds’s  is  not  essential  in  practice 
since most usual data structures are sequential. CDS02 is slower in the evaluation of fixpoints, but
is  overall  more  efficient  than  CDS01. 

It  is  possible  to  pose  intensional  queries  in  CDS02.  Whereas  in  CDS01  states  are  always 
explicitly  given  by  enumeration  of  events,  in  CDS02  the  interpreter  manipulates  expressions.  The 
rule  for  application,  given  below,  illustrates  the  point. 


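\[
\text{(App)}\qquad
\frac{A \;?\; B c' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c \\
        \mathrm{output}\ v'
      \end{cases}}
     {A.B \;?\; c' \;\to\;
      \begin{cases}
        \Omega \\
        \Omega \\
        v'
      \end{cases}}
\]

Instead of constructing approximations x to the state of B and posing queries of the form A ? xc',
as would be done in CDS01, we simply package the expression B with the cell c' and ask that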
question  of  A.  This  essentially  leads  to  a  direct  dialog  between  “interested  parties”  rather  than 
having  it  be  centralized  through  the  use  of  tables. 
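The “direct dialog” can be mimicked in Python by passing the argument as a callable that is consulted only when the algorithm actually tests a cell. This is a rough sketch, not the CDS02 rule itself: the tuple encoding of algorithms, the callable argument, and the names apply and arg are assumptions made for illustration, and divergence is modelled by raising an exception.

```python
OMEGA = "Ω"

LAND = ("valof", "B.1", {
    "tt": ("valof", "B.2", {"tt": ("output", "tt"), "ff": ("output", "ff")}),
    "ff": ("output", "ff"),
})

def apply(algo, arg_expr, log):
    """Evaluate algo.arg on demand; `log` records which cells were asked for."""
    instr = algo
    while instr[0] == "valof":
        _, cell, branches = instr
        log.append(cell)
        value = arg_expr(cell)  # ask the argument expression directly
        if value == OMEGA or value not in branches:
            return OMEGA
        instr = branches[value]
    return instr[1]

def arg(cell):
    """An argument whose first component is ff and whose other cells diverge."""
    if cell == "B.1":
        return "ff"
    raise RuntimeError("evaluating this cell would diverge")
```

Here apply(LAND, arg, log) returns ff and logs only B.1: no table of previously computed values is needed, and the diverging cell is never requested.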

The  rules  for  composition  and  fixpoint  are  in  the  same  vein. 


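\[
\text{(Comp)}\qquad
\frac{A \;?\; (B.x) c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c' \quad\text{and}\quad B \;?\; x c' \;\to\; \mathrm{valof}\ c \\
        \mathrm{output}\ v''
      \end{cases}}
     {A|B \;?\; x c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c \\
        \mathrm{output}\ v''
      \end{cases}}
\]

\[
\text{(Fix)}\qquad
\frac{A \;?\; \mathrm{fix}(A)\, c \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c' \\
        \mathrm{output}\ v
      \end{cases}}
     {\mathrm{fix}(A) \;?\; c \;\to\;
      \begin{cases}
        \Omega \\
        \Omega \\
        v
      \end{cases}}
\]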

The  rule  for  fixpoint  is  actually  an  optimization  that  builds  in  one  application  step  of  the 
following  rule: 

\[
\text{(Fix')}\qquad
\mathrm{fix}(A) \;?\; c \;\to\; A.\mathrm{fix}(A) \;?\; c
\]

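The remaining rules are simple games on the product tags of a cell. In the rule for uncurry,
π1, π2 are the first and second projections.

\[
\text{(Pair)}\qquad
\langle A_1, \ldots, A_n \rangle \;?\; x (c.i) \;\to\; A_i \;?\; x c
\]

\[
\text{(Prod)}\qquad
(A_1, \ldots, A_n) \;?\; (c.i) \;\to\; A_i \;?\; c
\]

\[
\text{(Curry)}\qquad
\frac{A \;?\; (x \times y) c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ (c.1) \\
        \mathrm{valof}\ (c'.2) \\
        \mathrm{output}\ v''
      \end{cases}}
     {\mathrm{curry}(A) \;?\; x y c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c \\
        \mathrm{output}\ \mathrm{valof}\ c' \\
        \mathrm{output}\ \mathrm{output}\ v''
      \end{cases}}
\]

\[
\text{(Uncurry)}\qquad
\frac{A \;?\; (\pi_1.x)(\pi_2.x) c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ c \\
        \mathrm{output}\ \mathrm{valof}\ c' \\
        \mathrm{output}\ \mathrm{output}\ v''
      \end{cases}}
     {\mathrm{uncurry}(A) \;?\; x c'' \;\to\;
      \begin{cases}
        \Omega \\
        \mathrm{valof}\ (c.1) \\
        \mathrm{valof}\ (c'.2) \\
        \mathrm{output}\ v''
      \end{cases}}
\]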

2.3.8  Related  languages 

A  language  in  some  ways  similar  to  CDSO  has  recently  been  developed  by  Cartwright,  Curien, 
and  Felleisen  [14].  It  is  called  SPCF  (Sequential  PCF)  and  it  extends  PCF  with  error  values  and 
primitives  for  non-local  transfer  of  control  (catch  and  throw).  This  enables  an  SPCF  program  to 
observe  and  exploit  order  of  evaluation  in  other  programs.  SPCF  programs  are  called  observably 
sequential  algorithms.  If  there  are  no  errors,  SPCF  collapses  into  CDSO.  In  the  presence  of  errors, 
SPCF  is  an  extension  of  CDSO. 


2.4  Parallel  algorithms  on  concrete  data  structures 

Brookes  and  Geva  [12]  proceeded  to  extend  Berry  and  Curien’s  work  to  the  setting  of  deterministic 
parallel  algorithms  on  concrete  data  structures.  The  aim  was  to  provide  a  general  intensional  theory 
of  deterministic  parallel  computation. 

A  parallel  algorithm  can  also  be  viewed  two  ways.  Abstractly,  it  is  a  pair  of  a  continuous 
function  and  a  (parallel)  computation  strategy.  Concretely,  it  is  a  program  in  a  language  of  parallel 
algorithms. 

The key change to sequential algorithms to yield parallel ones is to replace the valof construct
with a parallel query construct, which, intuitively, spawns off a number of valofs. More precisely,
a query starts a number of parallel sub-computations and specifies conditions based on the results
of the sub-computations under which the main computation may resume. As an example, here is
how we could write parallel-or (of type BOOL^2 → BOOL) in syntax meant to look like CDSO as
much as possible:
let por =
algo
  request B do
    query {(B.1), (B.2)} is
      {tt, _} : output tt
      {_, tt} : output tt
      {ff, ff} : output ff
    end
  end
end;

The  previous  algorithm,  while  deterministic,  is  a  little  misleading,  because  it  seems  to  imply  that 
one has knowledge of which argument was evaluated, so one could write a non-deterministic program.
To  ensure  determinism  we  have  to  make  sure  the  output  is  the  same  for  all  consistent  input  states. 
Another  problem  is  caused  by  higher-order  parallel  algorithms:  their  queries  have  to  apply  not 
only  to  the  immediate  input,  but  also  possible  further  inputs.  To  account  for  all  this,  the  notation 
would  have  to  be  somewhat  different  (see  [12]). 
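The determinism requirement can be checked mechanically on a small model. The Python sketch below is only an illustration: it represents an unevaluated argument by None, treats an algorithm as a function on such partial inputs, and tests that an output, once produced, never changes when the input state is extended. The names por, first_answer, extends and is_deterministic are hypothetical.

```python
from itertools import product

VALUES = (None, "tt", "ff")  # None stands for an argument not yet evaluated

def por(b1, b2):
    """Parallel or: answers as soon as either argument is tt."""
    if b1 == "tt" or b2 == "tt":
        return "tt"
    if b1 == "ff" and b2 == "ff":
        return "ff"
    return None

def first_answer(b1, b2):
    """A 'whoever answers first' program; it reveals evaluation order."""
    return b1 if b1 is not None else b2

def extends(x, y):
    """True if input state y contains at least the information in x."""
    return all(a is None or a == b for a, b in zip(x, y))

def is_deterministic(f):
    """Once f produces an output on x, every extension of x gives the same output."""
    points = list(product(VALUES, repeat=2))
    return all(
        f(*x) is None or f(*x) == f(*y)
        for x in points for y in points if extends(x, y)
    )
```

Under this check por passes, while first_answer fails: on the input (None, tt) it answers tt, but on the extension (ff, tt) it answers ff.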

Brookes and Geva were able to define application, currying and uncurrying of parallel
algorithms. However, this only works for first-order types and composition was not defined. Thus, they
did  not  obtain  a  ccc  of  parallel  algorithms  on  concrete  data  structures. 

It  is  important  to  note  that  Berry  and  Curien  also  had  difficulty  defining  composition  for 
sequential  algorithms;  their  solution  was  to  define  it  in  terms  of  the  abstract  view  of  sequential 
algorithms. 


2.5  Applications  of  sequential  algorithms 

A  sequential  algorithm  contains  detailed  information  about  the  relationship  between  input  and 
output.  It  does  not  simply  tell  us  how  the  output  depends  on  the  input,  but  precisely  how  parts 
of  the  output  depend  on  parts  of  the  input.  Seizing  on  this  intensional  information,  Hughes  and 
Ferguson [33, 50] developed applications using sequential algorithms as a representation for
functional programs. The programs are translated to categorical combinators, which are represented
by  sequential  algorithms.  The  sequential  algorithms  they  use  are  a  somewhat  simplified  version  of 
Berry  and  Curien’s.  The  tree  structure  of  algorithms  is  made  explicit,  and  cell  names  become  the 
concatenation  of  labels  found  on  branches  of  a  tree  from  the  root  to  a  leaf.  Values  are  stored  at 
the  leaves.  No  provision  is  made  for  the  from  construct;  in  our  vocabulary,  only  valof  and  result 
nodes  exist.  The  operational  semantics  employed  appears  to  be  a  version  of  CDS01. 

In  [50]  a  loop-detecting  interpreter  for  a  lazy,  higher-order  language  is  described.  The  standard 
approach  to  detecting  loops  is  to  check  for  a  recursive  function  being  called  twice  with  the  same 
arguments.  But  this  does  not  work  for  higher-order  functions.  Using  sequential  algorithms  one 
can  get  around  this  problem.  The  knowledge  of  what  input  cells  a  particular  output  cell  depends 
on  makes  it  possible  to  detect  when  a  cell  depends  on  itself.  To  do  this,  whenever  we  encounter 
a  valof  c  while  trying  to  compute  the  value  of  a  cell,  we  keep  track  of  c  and  all  the  cells  c  itself 
depends  on.  A  cell  is  called  detectably  bottom  when  it  either  depends  on  itself,  or  it  depends  on  a 
detectably  bottom  cell. 
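The definition of a detectably bottom cell amounts to a cycle search in the cell dependency graph. The following Python sketch illustrates this under a simplifying assumption: each cell is mapped to a fixed list of the cells it performs a valof on, ignoring that real dependencies are discovered branch by branch during evaluation. The function name and the dict encoding are hypothetical.

```python
def detectably_bottom(deps, cell, path=frozenset()):
    """A cell is detectably bottom if it depends on itself, or on a
    detectably bottom cell.

    `deps` maps a cell to the cells whose values it asks for;
    `path` holds the cells already being computed on the way here.
    """
    if cell in path:
        return True  # the cell (transitively) depends on itself
    return any(
        detectably_bottom(deps, needed, path | {cell})
        for needed in deps.get(cell, ())
    )
```

For example, with deps = {"a": ["b"], "b": ["a"], "c": ["d"], "d": [], "e": ["a"]}, the cells a, b and e are detectably bottom, while c and d are not.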

When having to answer the question fix f ? c, Hughes and Ferguson attempt to evaluate c in
the chain of increasing approximations to the fixpoint of f: ⊥, f.⊥, f^2.⊥, ..., either getting a value,
or showing that c is detectably bottom. The key point is that, in a finite dcds, there is a bound on
the number of unrollings of the fixpoint that must be made before it becomes clear that c cannot
be filled; this bound is simply the number of distinct cells in the dcds.

In  a  later  paper  [33],  an  abstract  interpretation  based  on  sequential  algorithms  is  developed  for  a 
higher-order,  lazy  language.  The  implementation  is  reported  as  being  orders  of  magnitude  faster  for 
higher-order  programs  than  competing  approaches,  such  as  frontiers  [45]  and  pending  analysis  [88]. 
The  problem  the  implementation  suffers  from  is  space-inefficiency.  This  does  not  seem  surprising, 
given that it uses a CDS01-like operational semantics. In more recent work, Hughes, Hunt, and
Runciman  [51]  report  on  attempts  at  overcoming  this  problem. 

Our  own  approach  to  abstract  interpretation  and  loop-detection  is  somewhat  different  because 
we  are  using  CDS02.  We  shall  discuss  this  issue  at  length  when  we  present  our  refinement  type 
inference  system  in  Chapter  6. 


2.6  Refinement  type  inference  for  Standard  ML 

In  the  Freeman-Pfenning  framework  for  refinement  type  inference  [35,  36],  only  datatypes  can 
be  refined,  and  the  refinements  are  specified  using  a  rectype  statement.  For  example,  here  is  a 
polymorphic  version  of  the  refinement  mentioned  in  Section  1.3,  of  empty,  one  element,  and  two  or 
more  element  lists: 

datatype α list = nil | cons of α * α list
rectype α empty = nil
and α singleton = cons (α, nil)
and α long = cons (α, cons (α, α ⊤list))
and α ⊥list = bottom (list)

This definition would result in the following refinement type lattice:

                α ⊤list

   α empty    α singleton    α long

                α ⊥list

Note that the refinement types are kept separate from the regular types. In particular, the
refinement type α ⊤list gets automatically generated to correspond to the regular type α list.

Given  such  a  rectype  declaration,  the  Freeman-Pfenning  system  automatically  generates  the 
following  type  for  the  constructor  cons : 

cons : (α * α empty) → α singleton ∧
       (α * α singleton) → α long ∧
       (α * α long) → α long.

In addition, their system can infer the following type for the polymorphic map function (written
in a slightly different way than the example from Section 1.3):

∀α,β. (α → β) → α empty → β empty ∧
                α singleton → β singleton ∧
                α long → β long.
The ability to define refinements of parametrized types enables rectype declarations such as the
following, which distinguishes even and odd length lists of booleans (runit stands for the empty
tuple refinement type):

datatype blist = nil | cons of bool * blist
rectype bev = cons (⊤bool * bod) | nil (runit)
and bod = cons (⊤bool * bev)

The  refinement  type  inference  algorithm  works  (roughly)  by  obtaining  a  regular  type  for  an 
expression,  then  trying  out  all  possible  refinements  of  that  type,  and  using  refinement  type  inference 
rules  to  reach  a  result  type.  In  the  case  of  refinement  type  variables,  all  possible  instantiations  at 
a  particular  type  must  be  considered.  Pending  analysis  is  used  for  fixpoints. 

The  main  difficulties  with  this  approach  seem  to  be  caused  by  instantiations  of  polymorphic 
refinement  type  variables.  There  are  two  problems:  In  many  cases,  one  cannot  use  the  polymorphic 
version  of  a  function  and  get  the  best  refinement  type,  and,  when  higher-order  functions  are  used, 
the  number  of  possible  refinements  gets  very  large,  which  leads  to  inefficiency. 

As  an  example,  suppose  we  have  refined  bool  by  true  and  false ,  and  we  want  to  derive  a 
refinement  type  for  the  following  program: 

let  val  not  =  fn  x  =>  if  x  then  false  else  true 
val  double  =  fn  f  =>  fn  x  =>  f  (f  x) 
in  double  not  true 
end; 

The type of double is ∀α. (α → α) → α → α. When double is applied to not, since the regular type
of not is bool → bool, we need to instantiate α to all possible refinements of bool, which leads to the
following refinement type for double:

(true → true) → true → true ∧
(false → false) → false → false ∧
(⊤bool → ⊤bool) → ⊤bool → ⊤bool ∧
(⊥bool → ⊥bool) → ⊥bool → ⊥bool.

Using this refinement type for double, since not has refinement type true → false ∧ false → true
(and, implicitly, ⊤bool → ⊤bool), the best refinement type we can get for double not is ⊤bool → ⊤bool,
and hence the entire program has type ⊤bool. The only way the more precise type of true can be
obtained is if double is no longer a polymorphic function. If we specify that variable x in double
is really a boolean, the refinement type of double becomes an intersection of 112 components,
including pieces that enable us to infer true → true ∧ false → false as the refinement type for
double not. Using that type, we can get true as the type for the program.
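The loss of precision can be reproduced in a toy model. The Python sketch below is not the Freeman-Pfenning algorithm; it assumes, purely for illustration, that each refinement of bool is modelled as the set of boolean values it contains, that a function has refinement type r → s when it maps r into s, and that the polymorphic double may only be instantiated at a single refinement r with not : r → r. All names are hypothetical.

```python
TRUE, FALSE = frozenset({True}), frozenset({False})
TOP, BOT = frozenset({True, False}), frozenset()
REFINEMENTS = [TRUE, FALSE, TOP, BOT]

def not_abs(r):
    """Image of the refinement r under boolean negation."""
    return frozenset(not b for b in r)

def has_type(f, arg, res):
    """f has refinement type arg -> res if it maps arg into res."""
    return f(arg) <= res

def poly_double_not(arg):
    """Best result when alpha must be instantiated at one refinement r
    such that not : r -> r and the argument has refinement r."""
    result = TOP
    for r in REFINEMENTS:
        if has_type(not_abs, r, r) and arg <= r:
            result &= r
    return result

def mono_double_not(arg):
    """Result when double is analysed directly at type bool."""
    return not_abs(not_abs(arg))
```

In this model poly_double_not(TRUE) is TOP, because the only usable instantiation containing true is ⊤bool, whereas mono_double_not(TRUE) is TRUE, matching the more precise type obtained once double is no longer polymorphic.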

2.7  Colson’s  work  on  intensional  expressiveness 

We  could  define  the  minimum  of  two  integers  in  unary  representation  in  a  natural  way  by  the 
following  rewrite  system: 

min(x, 0) = 0
min(0, x) = 0
min(S(x), S(y)) = S(min(x, y))

Note that this is not a primitive recursive (PR) algorithm; there is simultaneous recursion on two
inputs.

We need to distinguish between the function min defined above, and an algorithm min_a for
min. Intuitively, by applying the rewrite rules, the algorithm min_a(n,p) computes its result in
O(min(n,p)) time (it takes exactly min(n,p) + 1 steps).
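The step count can be checked with a direct simulation of the rewrite system. The Python sketch below is an illustration only: it represents unary numbers by ordinary non-negative integers and counts one step per rule application; the name min_steps is hypothetical.

```python
def min_steps(n, p):
    """Apply the rewrite rules for min to unary numbers n and p.

    Returns (result, number of rule applications).
    """
    steps = 0
    result = 0
    while True:
        steps += 1  # one rewrite rule is applied in each iteration
        if n == 0 or p == 0:
            # min(x, 0) = 0 or min(0, x) = 0
            return result, steps
        # min(S(x), S(y)) = S(min(x, y))
        n, p, result = n - 1, p - 1, result + 1
```

For instance, min_steps(3, 5) peels off three successors and then applies a base rule, for a total of four steps, regardless of how large the second argument is.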

Colson studied the expressibility of minimum in the context of PR algorithms [19, 20]. He
established that PR algorithms are inherently sequential: like sequential algorithms, they possess
sequentiality indices. Moreover, PR algorithms are sequential in an even stronger sense. They suffer
from “ultimate obstination” [20, 22]: at some point one argument must be chosen to be evaluated
until the end. Using primarily the intensional semantics of lazy natural numbers (LNAT), which
we exhibited earlier, he proved two main results:

Proposition 2.7.1 There is no PR algorithm a of arity 2 satisfying:

\[
[\![a]\!](S^n(\bot), S^p(\bot)) = S^{\min(n,p)}(\bot).
\]

Proposition 2.7.2 There is no PR algorithm which computes the minimum of two numbers n
and p in unary representation, and is of complexity O(min(n,p)).

However, there are many PR algorithms which compute the minimum of two integers. We
define one below, using some auxiliary functions (see [57]):

pred(0) = 0
pred(S(x)) = x

sub(x, 0) = x
sub(x, S(y)) = pred(sub(x, y))

MIN(x, y) = sub(x, sub(x, y))

Note that in an operational interpretation of this definition, the algorithm MIN_a(n,p) for MIN
has a worst-case running time of O(max(n,p)). Let us call the elements of LNAT of the form S^k(0)
defined, and the elements of the form S^k(⊥) partial. The function MIN agrees with min on the
defined elements of LNAT. They are different on the partial elements. By the LNAT semantics we
have:

\[
[\![min]\!](S^n(\bot), S^p(\bot)) = S^{\min(n,p)}(\bot)
\]
\[
[\![MIN]\!](S^n(\bot), S^p(\bot)) = \bot
\]
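The difference between min and MIN on partial elements can be traced in a small model of LNAT. The Python sketch below is an illustration under stated assumptions: a finite element S^k(0) or S^k(⊥) is encoded as a pair (k, end), pred and sub are taken to be strict in the argument they pattern-match on, and the infinite element is ignored. The names BOT, pred, sub, MIN and min_lazy are hypothetical.

```python
BOT = (0, "⊥")  # the undefined element; (k, "0") encodes S^k(0)

def pred(x):
    """pred(0) = 0, pred(S(x)) = x; strict, so pred(⊥) = ⊥."""
    k, end = x
    return x if k == 0 else (k - 1, end)

def sub(x, y):
    """sub recurses on y, so it is undefined unless y is a defined element."""
    k, end = y
    if end == "⊥":
        return BOT  # the recursion never reaches the base case sub(x, 0)
    for _ in range(k):
        x = pred(x)
    return x

def MIN(x, y):
    return sub(x, sub(x, y))

def min_lazy(x, y):
    """Semantics of the rewrite system for min on finite elements."""
    (n, ex), (p, ey) = x, y
    m = min(n, p)
    # After stripping m successors, the answer is 0 if either remainder is 0.
    done = (n == m and ex == "0") or (p == m and ey == "0")
    return (m, "0" if done else "⊥")
```

On defined elements the two functions agree, e.g. both give (2, "0") for S^2(0) and S^3(0); on the partial elements S^2(⊥) and S^3(⊥), min_lazy gives (2, "⊥") while MIN gives BOT.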

We can view Proposition 2.7.1 as an extensional result: PR algorithms can compute MIN
but not min. Note that there are many other functions between min and MIN in the pointwise
order. But it is the intensional aspect of Proposition 2.7.2 that is particularly interesting here: PR
algorithms cannot compute minimum efficiently.

If we augment PR algorithms with functional arguments, we arrive at Gödel’s system T [39].
In system T we can not only compute new functions (e.g., the Ackermann function), but we can
also compute minimum efficiently (system T can express an algorithm for min [19]). Thus, system
T is more powerful than PR both extensionally and intensionally.

Colson’s results are the first intensional expressiveness results for programming languages of
which we are aware.


Chapter  3 

Expressing  Minimum 


In this chapter, we begin our intensional explorations by studying the expressibility of the
minimum of two lazy natural numbers in CDSO. We expected to obtain results similar to Colson’s
in our study of sequential algorithms. After all, CDSO is a sequential programming language by
design: sequential algorithms compute sequential functions. It turns out, however, that sequential
algorithms are sufficiently more powerful than PR algorithms to be able to compute minimum
efficiently, but not powerful enough to compute the “natural” min function from Section 2.7. The
parallel query construct of Brookes and Geva allows us to compute that function. This, of course,
raises the question of whether the addition of query increases the intensional expressiveness of the
language. We show that it does; in particular, the computation of various n-ary functions can be
sped up. However, this assumes non-parallel evaluation of CDSO.

Section  3.1  defines  the  dcds  of  lazy  natural  numbers  and  shows  how  it  can  be  implemented  in 
CDSO,  along  with  various  algorithms  on  the  lazy  natural  numbers.  In  Section  3.2  we  show  that 
CDSO  cannot  compute  the  min  of  Section  2.7,  but  can  compute  minimum  efficiently.  We  exhibit 
an  algorithm  to  do  this.  We  introduce  the  extension  of  CDSO  with  query,  named  CDSP,  and  define 
its  semantics  in  Section  3.3.  Section  3.4  shows  the  comparison  of  CDSO  and  CDSP. 

3.1  Implementing  lazy  natural  numbers  in  CDSO 

LNAT, the dcds of lazy natural numbers, is defined as follows: It has cells $b_n$, for $n \ge 0$, values 0
and 1, and the following accessibility relation: $b_0$ is initial, and $\{b_i = 1\} \vdash b_{i+1}$ (filling a cell with 1
enables the next cell). Intuitively, filling a cell with 1 means there might be more to follow, whereas
0 means we are done. $(D(\mathrm{LNAT}), \subseteq)$ is isomorphic to the domain LNAT from the introduction.
The encoding of the lazy natural numbers is:

$$S^n(\bot) = \{b_i = 1 \mid i < n\},$$
$$S^n(0) = \{b_i = 1 \mid i < n\} \cup \{b_n = 0\}, \quad \text{for } n \ge 0,$$
$$S^\omega(\bot) = \{b_i = 1 \mid i \ge 0\}.$$
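The encoding can be made concrete with a small sketch. The following Python model is only an illustration, not CDSO: it represents a finite state as a dictionary from cell index i (standing for $b_i$) to a value, and the helper names `encode`, `is_state`, and `below` are hypothetical names introduced here.

```python
def encode(n, defined):
    """State for S^n(0) (defined=True) or S^n(bottom) (defined=False).

    The result maps a cell index i, standing for b_i, to its value.
    """
    state = {i: 1 for i in range(n)}
    if defined:
        state[n] = 0
    return state


def is_state(state):
    """Check the accessibility relation: b_0 is initial, and b_{i+1}
    may be filled only when b_i holds 1."""
    values_ok = all(v in (0, 1) for v in state.values())
    enabled = all(i == 0 or state.get(i - 1) == 1 for i in state)
    return values_ok and enabled


def below(x, y):
    """Domain order on states, taken here to be inclusion of cell fillings."""
    return all(i in y and y[i] == v for i, v in x.items())
```

Under this model, `encode(1, False)` (that is, $S(\bot)$) lies below `encode(3, True)` (that is, $S^3(0)$), while two distinct defined elements are incomparable.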

This  mathematical  definition  of  LNAT  can  be  implemented  as  CDSO  code  in  the  following  way: 

letrec  lnat  =  dcds 

cell  B  values  0,1 

graft  (lnat.s)  access  B  =  1 

end;

We ask the interpreter to unroll the definition by displaying the first few cells and their access
conditions:

# show more 3 lnat;
{
B values 0, 1,
(B.s) values 0, 1 access B=1,
((B.s).s) values 0, 1 access (B.s)=1
}

Now let us define a few constants: $\bot$, 0, $S(\bot)$, 1, and $S^\omega(\bot)$:

let Bot = {};
let Zero = {B=0};
let S_bot = {B=1};
let One = {B=1, (B.s)=0};

let  Srec  =  algo 

request  B  do 
output  1 
end 

request  ((B.$V).s)  do 
valof  (B.$V)  is 
1  :  output  1 
end 
end 

end; 

let  S_omega_bot  =  fix(Srec); 

$S^\omega(\bot)$ is defined as the least fixpoint of the algorithm which in the base case fills B with 1, and
recursively, if the previous cell contains 1, puts 1 into the current cell. All the algorithms we shall
write on LNAT will have a similar structure. Note how we have used a variable ($V) to stand for
a sequence of tags .s of any length (including 0).

Now we can write the successor algorithm; it is shown in Figure 3.1. It is only slightly more
complicated than the algorithm for $S^\omega(\bot)$, but warrants further explanation because it is
higher-order. Successor is defined as the fixpoint of a higher-order algorithm and it works as follows: If
asked what B is, it immediately outputs 1 (the successor of anything is at least $S(\bot)$). In the
general case, if asked what value an output cell holds, it asks what value the input cell immediately
preceding it holds, and outputs the same value.


3.2  CDSO  and  minimum 

We begin by showing that sequential algorithms cannot compute min. The proof follows standard
lines (cf. [6, 12]).


Proposition 3.2.1 There is no sequential algorithm computing min.

let succ_rec =
algo
  request {}B do
    output output 1
  end

  request {}((B.$V).s) do
    output valof (B.$V)
  end

  request {(B.$V)=0}((B.$V).s) do
    output output 0
  end

  request {(B.$V)=1}((B.$V).s) do
    output output 1
  end
end;

let S = fix(succ_rec);


Figure  3.1:  The  successor  algorithm 


Proof: A sequential algorithm computes a sequential function. But min is not sequential, since
it has no sequentiality index at $(\bot, \bot)$ for output cell $b_0$. In other words, there is no input cell
which must be filled in order for min to fill $b_0$. (Actually, min has no sequentiality index at any
$(S^n(\bot), S^n(\bot))$ for $b_n$, $n \ge 0$.) Therefore, no CDSO algorithm can compute min. □

But this does not mean we cannot compute minimum efficiently in CDSO. Recall that the
problem with $\mathcal{PR}$ algorithms was that they become "fixated" on one input. Sequential algorithms
allow us to keep alternating between the two inputs, examining one cell at a time.

Proposition 3.2.2 There is a sequential algorithm which computes the minimum of two numbers
n and p in unary representation, and is of time complexity $O(\min(n,p))$.

Proof: The actual algorithm is listed in Appendix B.1. Even though it looks rather complicated it
has  the  same  basic  structure  as  the  previous  LNAT  algorithms.  For  the  purpose  of  the  presentation, 
we  shall  assume  the  existence  of  a  higher-level  ML-like  syntax  for  CDSO  algorithms,  and  discuss 
an  algorithm  written  in  that  syntax.  It  is  implicitly  to  be  understood,  however,  that  we  are  really 
referring  to  the  CDSO  program. 

In  higher-level  syntax,  the  algorithm  looks  like  a  simple  sequential  version  of  the  min  function 
definition  from  Section  2.7.  We  choose  the  left  input  to  evaluate  first. 

algo left_min (n1, n2) =
  case n1 of
    0 => 0
  | S(x) => case n2 of
              0 => 0
            | S(y) => S(left_min(x, y))

The algorithm has the following property:

$$[\![\mathit{left\_min}]\!](S^n(0), S^p(0)) = S^{\min(n,p)}(0),$$

so it does compute the minimum, and it works in time $O(\min(n,p))$ by alternating between the
inputs and examining one cell at a time. □
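The alternation argument can be illustrated by counting input cells. The sketch below is a Python model, not the CDSO program of Appendix B.1; it assumes an input is given as a reader function returning 1, 0, or `None` (cell not filled) for cell index i, and the names `lazy_nat` and `left_min_cells` are hypothetical.

```python
def lazy_nat(n, defined=True):
    """Cell reader for S^n(0) (defined=True) or S^n(bottom) (defined=False)."""
    def read(i):
        if i < n:
            return 1
        if i == n and defined:
            return 0
        return None  # cell is not filled
    return read


def left_min_cells(left, right):
    """Alternate between the inputs, left first, one cell at a time.

    Returns (list of output cell values, number of input cells examined).
    """
    out, reads, i = [], 0, 0
    while True:
        a = left(i)
        reads += 1
        if a is None:          # stuck on the left input
            return out, reads
        if a == 0:             # left input ends here
            out.append(0)
            return out, reads
        b = right(i)
        reads += 1
        if b is None:          # stuck on the right input
            return out, reads
        out.append(b)          # b is 1 (continue) or 0 (right input ends)
        if b == 0:
            return out, reads
        i += 1
```

In this model the number of cells examined on defined inputs $S^n(0)$ and $S^p(0)$ is at most $2\min(n,p) + 2$, independent of the larger argument.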

Note  that  the  algorithm  also  satisfies: 

$$[\![\mathit{left\_min}]\!](S^n(\bot), S^p(\bot)) = S^{\min(n,p)}(\bot),$$

so  Colson’s  Proposition  2.7.1  fails  as  well  in  the  context  of  sequential  algorithms. 

The key difference between left_min and mina is illustrated by their behavior on pairs of a
totally defined and a partial element, such as $(S^n(\bot), S^n(0))$ (they agree on all other inputs):

$$[\![\mathit{left\_min}]\!](S^n(\bot), S^n(0)) = S^n(\bot)$$
$$[\![\mathit{mina}]\!](S^n(\bot), S^n(0)) = S^n(0)$$

This  comparison  makes  it  clear  that  min  is  a  parallel  function:  it  must  evaluate  its  inputs  in  parallel 
in  order  to  be  able  to  determine  when  either  one  is  defined.  Also  note  that  [left_min]  fits  between 
min  and  MIN  in  the  pointwise  order. 
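A term-level sketch makes the difference between the two functions explicit. It is a Python model under stated assumptions: `None` stands for $\bot$, the integer 0 for zero, and a pair `("S", x)` for a successor; `par_min` follows the clauses of the "natural" min (either argument being 0 suffices), and all names are hypothetical.

```python
BOT = None  # stands for the undefined element


def S(x):
    return ("S", x)


def nat(n, tail=0):
    """Build S^n(tail); tail=0 gives S^n(0), tail=BOT gives S^n(bottom)."""
    x = tail
    for _ in range(n):
        x = S(x)
    return x


def left_min(a, b):
    """Sequential minimum: inspects the left argument first."""
    if a is BOT:
        return BOT
    if a == 0:
        return 0
    if b is BOT:
        return BOT
    if b == 0:
        return 0
    return S(left_min(a[1], b[1]))


def par_min(a, b):
    """'Natural' minimum: answers 0 as soon as either argument is 0."""
    if a == 0 or b == 0:
        return 0
    if a is BOT or b is BOT:
        return BOT
    return S(par_min(a[1], b[1]))
```

On the pair $(S^2(\bot), S^2(0))$ the first function returns $S^2(\bot)$ and the second $S^2(0)$; on totally defined inputs they coincide.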

We  illustrate  the  behavior  of  left_min  with  the  aid  of  the  interpreter: 

# left_min.(One,S_bot);
request? B;
--> 1
request? (B.s);
--> 0
request? ;

# left_min.(S_bot,One);
request? B;
--> 1
request? (B.s);
-->
request? ;

# left_min.(S_omega_bot,One);
request? B;
--> 1
request? (B.s);
--> 0


3.3  CDSP 

We  now  consider  the  extension  of  CDSO  with  the  query  construct,  which  we  call  CDSP,  standing 
for  “CDS  Parallel.”  This  was  examined  in  detail  by  Brookes  and  Geva  [12]  from  a  denotational 
point  of  view.  We  are  more  interested  in  the  operational  aspect,  since  we  want  to  know  the  running 
time  of  programs  that  use  query.  Consequently,  we  look  at  the  changes  necessary  to  the  forest 
representation  and  semantics  in  order  to  accommodate  query. 

As  when  we  first  described  the  query  construct  in  Section  2.4,  we  shall  ignore  issues  related  to 
ensuring that consistent inputs lead to the same output, and issues of the necessity of specifying
future inputs in certain cases (such as for curried parallel algorithms). These issues are not related
to  our  main  concerns;  we  refer  the  reader  to  [12]  for  the  extra  notation  required  to  handle  such 
problems. 


3.3.1  Forest  semantics  of  query 

We extend the set of forest instructions by queries, with the following general form:

$$\begin{array}{l}
\textit{Query}\ \{(c_1,i_1),\ldots,(c_n,i_n)\}\ \textit{is} \\
\quad \{v_{11},\ldots,v_{1n}\} : ins_1 \\
\quad \vdots \\
\quad \{v_{m1},\ldots,v_{mn}\} : ins_m \\
\textit{end}
\end{array}$$

A query will have a number of patterns, each one of which is a vector of values extended with
the special symbol _. We introduce auxiliary notation to talk about patterns separately. In
general, the ith pattern will have the form:

$$\{(c_1,i_1),\ldots,(c_n,i_n)\}\ \textit{is}\ \{v_{i1},\ldots,v_{in}\}.$$

Evaluating a pattern involves evaluating all the cells for which the corresponding pattern position
is not _ in parallel, and verifying that the values match. There are three possible answers:

1.  The  values  match,  in  which  case  we  return  the  special  value  match . 

2.  There  is  at  least  one  value  that  does  not  match.  We  return  no  match. 

3.  We  do  not  have  enough  information  to  decide  if  we  have  a  match.  In  this  case  we  issue  a 
residual  pattern  which  asks  for  the  values  of  just  those  cells  whose  values  we  still  need  to 
know. 


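This is summarized in the following evaluation rule for patterns, using the conventions that for any
$v$, $\_ \sqsubseteq v$, and $(v_1,\ldots,v_n) \sqsubseteq (v'_1,\ldots,v'_n)$ when, for each $i$, $v_i \sqsubseteq v'_i$.

$$(\textit{Pat})\quad
\frac{x_{i_1}\ ?\ c_1 \to v'_1 \quad\cdots\quad x_{i_n}\ ?\ c_n \to v'_n
\quad\text{and}\quad
\begin{cases}
(v'_1,\ldots,v'_n) \sqsupseteq (v_1,\ldots,v_n) \\
(v'_1,\ldots,v'_n)\ \text{incomparable to}\ (v_1,\ldots,v_n) \\
(v'_1,\ldots,v'_n) \sqsubset (v_1,\ldots,v_n)
\end{cases}}
{\{(c_1,i_1),\ldots,(c_n,i_n)\}\ \textit{is}\ \{v_1,\ldots,v_n\}\ ?\ x_1 \cdots x_n c' \to
\begin{cases}
\textit{match} \\
\textit{no match} \\
\textit{residual pattern}
\end{cases}}$$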

When  executing  a  query,  we  will  evaluate  all  the  patterns  in  parallel.  There  are  also  three 
possibilities: 


1.  At  least  one  pattern  matches  (it  is  fine  if  several  patterns  match,  since  we  assume  outputs 
are  the  same  in  that  case),  in  which  case  we  execute  the  appropriate  instruction. 

2. No pattern matches, in which case we return $\Omega$.

3. Evaluation of all patterns results in residual patterns. In that case we return a residual query
by putting together the residual patterns. We will not provide details on constructing such
queries.

We present below the rule for evaluation of query instructions. $\textit{Pattern}_k$ stands for the kth
pattern in the query, and $1 \le k \le m$.

$$(\textit{Qry})\quad
\frac{\begin{cases}
\textit{Pattern}_k\ ?\ x_1 \cdots x_n c' \to \textit{match} \\
\forall k.\ \textit{Pattern}_k\ ?\ x_1 \cdots x_n c' \to \textit{no match} \\
\forall k.\ \textit{Pattern}_k\ ?\ x_1 \cdots x_n c' \to \textit{residual pattern}
\end{cases}}
{\begin{array}{l}
\textit{Query}\ \{(c_1,i_1),\ldots,(c_n,i_n)\}\ \textit{is} \\
\quad \{v_{11},\ldots,v_{1n}\} : ins_1 \\
\quad \vdots \\
\quad \{v_{m1},\ldots,v_{mn}\} : ins_m \\
\textit{end}
\end{array}
\ ?\ x_1 \cdots x_n c' \to
\begin{cases}
ins_k\ ?\ x_1 \cdots x_n c' \\
\Omega \\
\textit{residual query}
\end{cases}}$$
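The three-way answers of pattern and query evaluation can be sketched in Python. This is a simplified model rather than the forest semantics itself: cell values are given as a tuple in which `None` means "not yet known", the string `"_"` is the don't-care symbol, and the function names are hypothetical. The prose does not say what happens when some patterns fail and the rest are residual; the sketch assumes a residual query in that case.

```python
WILD = "_"


def eval_pattern(pattern, values):
    """Return (answer, positions still needed) for one pattern."""
    needed = []
    for pos, (want, got) in enumerate(zip(pattern, values)):
        if want == WILD:
            continue                  # position is not examined
        if got is None:
            needed.append(pos)        # value not yet available
        elif got != want:
            return "no match", []     # one mismatch decides the pattern
    if needed:
        return "residual pattern", needed
    return "match", []


def eval_query(branches, values):
    """branches is a list of (pattern, result) pairs."""
    pending = False
    for pattern, result in branches:
        answer, _ = eval_pattern(pattern, values)
        if answer == "match":
            return result             # outputs agree if several match
        if answer == "residual pattern":
            pending = True
    # Assumption: a mix of failed and residual patterns is still residual.
    return "residual query" if pending else "Omega"
```

For instance, with the branches of parallel-or, `eval_query` already answers `"tt"` when only the second input is known to be `"tt"`.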


3.3.2  CDSP  and  minimum 

The addition of the parallel query construct enables us to compute min, which is essentially a
generalization of parallel-or to integer arguments. The program is shown in Appendix B.2. Note
that it is actually simpler than the CDSO program for left_min. Again, for the purpose of the
presentation, we use a higher-level syntax. In that syntax, the program looks almost the same as
the definition of the min function from the introduction:

algo min (n1, n2) =
  query (n1, n2) is
    (0, _) => 0
  | (_, 0) => 0
  | (S(x), S(y)) => S(min(x, y))

The program is clearly efficient, examining two cells at a time. We then obtain the following:

Proposition 3.3.1 There is a CDSP program computing min.


3.4  CDSO  versus  CDSP 

We have seen that both CDSO and CDSP can compute the minimum of two lazy natural
numbers efficiently. This raises the question of whether the addition of deterministic parallelism to
CDSO buys us any intensional power. There actually appears to be a folk conjecture that
deterministic parallelism is not "useful." The claim is that even though deterministic parallel features
may increase the extensional expressiveness of a language, they are expensive to use and the
additional expressiveness is not useful in practice, because it applies only to computations that are
unbounded. In our terms, the claim is that deterministic parallelism may increase extensional, but
not intensional expressiveness.

This  conjecture  is  false.  Deterministic  parallelism  does  add  intensional  expressiveness.  The 
deterministic  query  construct  of  CDSP  is  sufficiently  general  to  allow  a  speedup  in  the  computation 
of  many  different  functions.  When  computing  certain  n-ary  functions,  the  query  construct  allows 
us  to  construct  a  tree  of  processes  of  logarithmic  depth.  We  illustrate  with  n-ary  disjunction. 

For  notational  simplicity,  we  define  a  separate  function  for  each  value  of  n,  and  we  assume  n  is 
a  (fixed)  power  of  2.  We  have  already  defined  por  for  two  arguments.  Here  is  the  general  case: 

algo por_n (b_1, ..., b_n) =
  por (por_{n/2} (b_1, ..., b_{n/2}),
       por_{n/2} (b_{n/2+1}, ..., b_n))

When computing $por_n$, in order to fill the output cell we query in parallel two cells. In order
to fill those cells, we query two more for each. Intuitively, after a depth of $\log n$ queries we reach
our n inputs. Therefore, we compute the result in time $O(\log n)$. In CDSO, since we must examine
the inputs sequentially, we can only compute the result in time $O(n)$.
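The tree structure can be sketched as follows. This Python model is an illustration only: `None` stands for an undefined input, `por` is assumed to be the usual parallel-or (true if either argument is true, false if both are false, undefined otherwise), and `por_tree_depth` is a hypothetical helper counting nested `por` levels.

```python
def por(a, b):
    """Binary parallel-or over True, False, and None (undefined)."""
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return None


def por_n(bits):
    """n-ary disjunction as a balanced tree; len(bits) is a power of 2, >= 2."""
    if len(bits) == 2:
        return por(bits[0], bits[1])
    half = len(bits) // 2
    return por(por_n(bits[:half]), por_n(bits[half:]))


def por_tree_depth(n):
    """Number of nested por levels in por_n, for n a power of 2, n >= 2."""
    return 1 if n == 2 else 1 + por_tree_depth(n // 2)
```

The nesting depth for n = 8 is 3, matching the $\log n$ levels described above, whereas a left-to-right scan may have to look at all n inputs.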

We  can  formalize  this  by  instrumenting  our  operational  semantics  to  keep  track  of  depth  of  the 
computation.  We  only  do  this  for  result,  query,  valof,  application,  and  product,  as  the  others  are 
similar. 

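The new rules will have the form $\textit{forest}, t\ ?\ c \to v, t'$, where $t$ stands for the time (or depth)
at which the question is asked, and $t'$ for the time at which an answer is issued. The modifications
for result and valof are simple.

$$(\textit{Result}')\quad \textit{Result}\ v',\ t\ ?\ x_1 \cdots x_n c' \to v',\ t + 1$$

$$(\textit{Valof}')\quad
\frac{x_p,\ t\ ?\ c \to
\begin{cases}
v_i,\ t' \\
v,\ t' \ \text{and}\ \forall i.\ v_i \neq v \\
\text{[illegible in source]}
\end{cases}}
{\begin{array}{l}
\textit{Valof}\ (c,i)\ \textit{is} \\
\quad v_i : ins_i \\
\textit{end}
\end{array},\ t\ ?\ x_1 \cdots x_n c' \to
\begin{cases}
ins_i,\ t' + 1\ ?\ x_1 \cdots x_n c' \\
\Omega,\ t' + 1 \\
\textit{output}^{\,p-1}\ \textit{valof}\ c,\ t' + 1
\end{cases}}$$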

For  query,  the  difference  is  that  we  are  evaluating  the  patterns  in  parallel,  and,  within  each 
pattern,  the  cells  are  also  evaluated  in  parallel.  The  depth  of  a  pattern  evaluation  will  depend 
on  the  maximum  of  the  depths  of  its  sub-computations.  The  depth  of  a  successful  query  will  be 
the  minimum  of  the  depths  of  the  matching  pattern  evaluations;  an  unsuccessful  query  will  have  a 
depth  that  is  the  maximum  of  the  depths  of  all  pattern  evaluations. 


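$$(\textit{Pat}')\quad
\frac{x_{i_1},\ t\ ?\ c_1 \to v'_1,\ t_1 \quad\cdots\quad x_{i_n},\ t\ ?\ c_n \to v'_n,\ t_n
\quad\text{and}\quad
\begin{cases}
(v'_1,\ldots,v'_n) \sqsupseteq (v_1,\ldots,v_n) \\
(v'_1,\ldots,v'_n)\ \text{incomparable to}\ (v_1,\ldots,v_n) \\
(v'_1,\ldots,v'_n) \sqsubset (v_1,\ldots,v_n)
\end{cases}}
{\{(c_1,i_1),\ldots,(c_n,i_n)\}\ \textit{is}\ \{v_1,\ldots,v_n\},\ t\ ?\ x_1 \cdots x_n c' \to
\begin{cases}
\textit{match},\ T \\
\textit{no match},\ T \\
\textit{residual pattern},\ T
\end{cases}}$$

$$(\textit{Qry}')\quad
\frac{\begin{cases}
\textit{Pattern}_k,\ t\ ?\ x_1 \cdots x_n c' \to \textit{match},\ t_k \\
\forall k.\ \textit{Pattern}_k,\ t\ ?\ x_1 \cdots x_n c' \to \textit{no match},\ t_k \\
\forall k.\ \textit{Pattern}_k,\ t\ ?\ x_1 \cdots x_n c' \to \textit{residual pattern},\ t_k
\end{cases}}
{\begin{array}{l}
\textit{Query}\ \{(c_1,i_1),\ldots\}\ \textit{is} \\
\quad \{v_{11},\ldots,v_{1n}\} : ins_1 \\
\quad \vdots \\
\quad \{v_{m1},\ldots,v_{mn}\} : ins_m \\
\textit{end}
\end{array},\ t\ ?\ x_1 \cdots x_n c' \to
\begin{cases}
ins_k,\ T_{min}\ ?\ x_1 \cdots x_n c' \\
\Omega,\ T_{max} \\
\textit{residual query},\ T_{max}
\end{cases}}$$

where $T = 1 + \max\{t_1,\ldots,t_n\}$, $T_{max} = \max\{t_1,\ldots,t_n\}$, and $T_{min} = \min\{t_{k_1},\ldots\}$ for all $k_i$
such that $\textit{Pattern}_{k_i},\ t\ ?\ x_1 \cdots x_n c' \to \textit{match},\ t_{k_i}$.

$$(\textit{Prod}')\quad \prod_{i=1}^{n} A_i,\ t\ ?\ (c.i) \to A_i,\ t + 1\ ?\ c$$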

Example 3.4.1 Suppose we just want to ask a question of a ground state. Let us consider the
state {B = tt}. Its forest representation is Tree(B, Result tt), so we have:

$$\{B = tt\},\ 0\ ?\ B \to tt,\ 1.$$

Example 3.4.2 We have already seen the internal representation of land in Figure 2.5. We have:

$$\textit{land}.\{(B.1) = tt, (B.2) = tt\},\ 0\ ?\ B \to tt,\ 4,$$

because there are two valof and two result instructions along the way.

We are now ready to prove that n-ary disjunction works in logarithmic time.

Proposition 3.4.3 $por_n(b_1,\ldots,b_n),\ 0\ ?\ B \to v, t$, where $t \le 4 \log n$.

Proof: By induction on n, which is always a power of 2.

In the base case we have $por(b_1, b_2),\ 0\ ?\ B \to v, 4$, since we execute App', Query', Prod',
and Result'. But $4 = 4 \log 2$, so the proposition holds for n = 2. Suppose it holds for n. Then

$$por_{2n}(por_n(b_1,\ldots,b_n), por_n(b_{n+1},\ldots,b_{2n})),\ 0\ ?\ B \to v, t,$$

where $t = 1 + 1 + \max\{1 + 4 \log n, 1 + 4 \log n\} = 3 + 4 \log n$. But $4 \log 2n = 4 + 4 \log n > t$. □
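The induction can be checked numerically. The Python sketch below assumes the recurrence read off the proof, namely depth 4 for n = 2 and $t(2n) = 1 + 1 + \max\{1 + t(n), 1 + t(n)\}$; the function name `por_depth` is hypothetical, and logarithms are base 2.

```python
def por_depth(n):
    """Depth of evaluating por_n, for n a power of 2 with n >= 2,
    following the recurrence used in the proof of Proposition 3.4.3."""
    t, size = 4, 2                      # base case: por on two inputs
    while size < n:
        t = 1 + 1 + max(1 + t, 1 + t)   # one more level of por
        size *= 2
    return t


# Compare against the bound 4 log n for n = 2^k.
for k in range(1, 11):
    assert por_depth(2 ** k) == 3 * k + 1
    assert por_depth(2 ** k) <= 4 * k
```

Under this recurrence the depth is exactly $3 \log n + 1$, which stays within the stated bound of $4 \log n$.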

So  we  have  established  that: 

Proposition  3.4.4  CDSP  is  intensionally  more  expressive  than  CDSO. 


3.5  Discussion 

The sequentiality of the primitive recursive algorithms is manifested by their ability to recur on
only one input. This makes them "ultimately obstinate," and they are not able to express an
efficient algorithm for minimum.

The  sequentiality  of  Berry-Curien  algorithms  is  “by  design.”  A  sequential  algorithm  computes 
a  sequential  function,  by  only  choosing  one  sequentiality  index  at  a  time,  even  if  more  than  one 
exists.  However,  sequential  algorithms  are  more  expressive  than  primitive  recursive  algorithms: 
there  is  a  sequential  algorithm  that  computes  a  version  of  the  minimum  function  efficiently,  but 
not the "natural," inherently parallel, minimum function.

The addition of deterministic parallelism to CDSO allowed us to compute the "natural" version
of the minimum function, but CDSO was already able to express an efficient minimum algorithm.
However, the addition of deterministic parallelism did add intensional expressiveness, contradicting
a folk conjecture. The computation of a number of functions, such as n-ary disjunction, can be
sped up.

Note  that  there  is  a  certain  sense,  however,  in  which  our  comparison  of  CDSO  and  CDSP  is 
not  “fair.”  It  is  possible  to  imagine  parallel  evaluation  strategies  for  CDSO  (cf.  Curien  [26]).  Such 
parallel  evaluation  would  not  work  well  with  CDS02,  but  in  CDS01,  with  its  tables,  we  could  have 
eager  computation  which  fills  the  table  without  waiting  for  a  question.  This  could,  of  course,  lead 
to  a  lot  of  useless  computation,  so  it  may  be  possible  that  we  get  good  time-efficiency,  but  poor 
work-efficiency.  We  return  to  this  point  in  the  concluding  chapter. 


Chapter  4 

Circuit  Semantics 


This chapter consists of two parts, both concerned with establishing relative intensional
expressiveness results for parallel extensions of PCF, and both utilizing circuit semantics as the main tool.
Circuit semantics associates a gate with each basic construct of the language, and takes the
meaning of a program to be a circuit. The dimensions of the circuit enable reasoning about running time
and  work  required  for  execution.  In  the  first  part  of  the  chapter,  we  compare  four  deterministic 
extensions  of  PCF:  parallel-or,  parallel  conditionals  on  booleans  and  integers,  and  deterministic 
query  [12].  To  aid  us  in  this  comparison  we  introduce  a  naive  version  of  the  circuit  semantics  (first 
reported  in  [13]),  which  enables  us  to  talk  about  relative  depth  of  an  implementation.  This  notion 
is  good  enough  to  produce  a  hierarchy  of  intensional  expressiveness:  query  is  the  most  powerful, 
followed  by  parallel  conditional  on  integers,  while  parallel-or  and  parallel  conditional  on  booleans 
are  equivalent  and  the  weakest. 

In  the  second  part  of  the  chapter,  we  compare  deterministic  query  with  a  nondeterministic 
version  (first  presented  in  [27]).  We  refine  the  circuit  semantics  to  allow  us  to  talk  about  parallel 
time  and  parallel  work  required  for  execution,  and  we  establish  connections  between  the  size  and 
depth  of  a  circuit  representing  a  parallel  PCF  program  to  the  time  and  work  required  to  execute  it 
under call-by-speculation [49], parallel call-by-value, and parallel eager evaluation. We also relate
the  circuit  dimensions  of  a  program  to  the  time  and  number  of  processors  required  to  execute  it  in 
the  PRAM  model. 

In  order  to  be  able  to  compare  the  two  versions  of  query,  we  are  forced  to  make  a  hardware 
assumption  which  is  equivalent  to  having  the  ability  to  detect  undefined  inputs.  This  makes  a 
subset  of  the  programs  using  nondeterministic  query  return  a  deterministic  result.  The  assumption 
is  reasonable  from  a  practical  point  of  view  and  has  been  used  in  various  studies  of  consensus 
problems  in  distributed  systems  [34].  The  effect  of  this  assumption  is  to  render  our  problem 
similar  to  one  from  computational  complexity,  that  of  comparing  monotone  and  De  Morgan  boolean 
circuits.  It  turns  out  that  parallel  PCF  programs  are  intensionally  equivalent  to  boolean  circuits 
for  a  certain  class  of  functions  involving  undefined  inputs.  This  connection  allows  us  to  use  strong 
results  from  complexity  theory  to  establish  intensional  expressiveness  results. 

Section  4.1  describes  the  slightly  different  version  of  PCF  we  are  using,  and  the  deterministic 
parallel  extensions  we  shall  be  comparing.  A  first  version  of  the  circuit  semantics  is  introduced  in 
Section  4.2  and  is  used  to  obtain  the  first  intensional  separation  results  in  Section  4.3.  Section  4.4 
describes the recursion-free version of PCF we use for the nondeterministic extension, and
introduces the nondeterministic query. The circuit semantics is refined in Section 4.5 and the method
for making the comparison between determinism and nondeterminism is outlined in Section 4.6.
Section 4.7 shows a connection between our question and circuit complexity, which is used in
Section 4.8 to obtain the separation of deterministic and nondeterministic query. Finally, Section 4.9
provides a discussion of the results.

4.1  PCF  and  deterministic  parallel  extensions 

4.1.1  PCF 

In  addition  to  the  standard  PCF  constants  listed  in  Figure  2.1,  we  assume  the  existence  of  a 
constant-time  equality  test  for  integers: 

$$= \ :\ \iota \to \iota \to o$$

with the obvious operational semantics. Traditionally, the equality test is implemented using
recursion (cf. [81]), but this would render some of the issues of interest to us moot. The reason
for  this  is  that  in  what  follows  we  will  want  to  know  when  one  construct  can  be  implemented  in 
terms  of  others  without  using  recursion.  Since  we  are  not  using  integers  in  unary  representation, 
it  would  be  unreasonable  to  have  to  use  recursion  to  check  for  equality.  In  fact,  a  more  realistic 
logarithmic-time  test  would  not  invalidate  our  results;  we  chose  a  constant-time  test  because  it  is 
simpler. 

Let $FV(M)$ stand for the set of free variables of term $M$. If $FV(M) = \emptyset$ then the term $M$ is
closed, else it is open. The closed terms of ground type are referred to as programs.

4.1.2  Parallel-or  and  parallel  conditionals 

The parallel extensions of PCF studied in Plotkin's seminal paper [70] are: por, $\mathit{pif}_o$ (parallel
conditional on booleans), and $\mathit{pif}_\iota$ (parallel conditional on integers). The extension of PCF with
any of these functions is fully abstract with respect to the standard denotational semantics. The
parallel conditionals are defined as follows:

$$\begin{array}{l}
\mathit{pif}_\sigma : o \to \sigma \to \sigma \to \sigma \\
\mathit{pif}_\sigma\ \bot\ x\ x = x \\
\mathit{pif}_\sigma\ \mathit{tt}\ x\ \bot = x \\
\mathit{pif}_\sigma\ \mathit{ff}\ \bot\ x = x
\end{array}$$

for $\sigma = \iota, o$. Por, $\mathit{pif}_o$, and $\mathit{pif}_\iota$ are known to be extensionally equivalent [26, 81], i.e., one can
be implemented in terms of another. The question we address is whether they are intensionally
equivalent. Interestingly, the answer turns out to be negative.
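The defining equations can be sketched directly. The Python model below is an illustration under the assumption that `None` stands for $\bot$; it also shows one direction of the extensional equivalence, deriving parallel-or from the parallel conditional by the standard encoding por a b = pif a tt b. The name `por_from_pif` is hypothetical.

```python
def pif(b, x, y):
    """Parallel conditional over values where None means undefined."""
    if b is True:
        return x
    if b is False:
        return y
    # Condition undefined: answer only if both branches agree on a value.
    if x is not None and x == y:
        return x
    return None


def por_from_pif(a, b):
    """Parallel-or expressed with the parallel conditional on booleans."""
    return pif(a, True, b)
```

The encoding returns `True` as soon as either argument is `True`, even when the other is undefined, which is the characteristic behaviour of parallel-or.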

4.1.3  Query 

Another  parallel  deterministic  extension  we  are  interested  in  exploring  is  query.  We  have  already 
encountered  query  in  the  context  of  concrete  data  structures,  but  the  construct  is  quite  general, 
and in this chapter we use it to extend PCF. Figure 4.1(a) shows the PCF-like syntax we envision
for query in yet another example of a program for parallel-or.

The general form of the query syntax is shown in Figure 4.1(b). The $x_i^{\sigma_i}$ ($\sigma_i \in \{o, \iota\}$) are
variables, the $p_i$ are patterns, and the $M_i^\tau$ are PCF terms. A pattern is a vector of length n (the
number of variables in the query), with each element being either a variable y, a closed term e of
ground type, or the "don't care" symbol _. All of the variables in one pattern must be distinct.

We  distinguish  between  two  versions  of  the  query  construct:  deterministic  and  nondeterministic. 
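In the first case, we shall require the same output for all consistent inputs. Let $p_1 = (v_1,\ldots,v_n)$
and $p_2 = (w_1,\ldots,w_n)$ be two patterns. We call the two patterns consistent (written $p_1 \Uparrow p_2$) if $\forall i.$
either $\mathcal{D}[\![v_i]\!] \sqsubseteq \mathcal{D}[\![w_i]\!]$ or $\mathcal{D}[\![v_i]\!] \sqsupseteq \mathcal{D}[\![w_i]\!]$. Since the elements of patterns come from flat domains,
this formulation of consistency coincides with the conventional notion of "having an upper bound."
We extend the standard semantics with $\mathcal{D}[\![\_]\!]\rho = \bot$, and note that a variable in a pattern will
always be consistent with anything, since it is the equivalent of a "don't care."

$$\text{(a)}\quad \begin{array}{l}
por = \lambda x y.\ \textit{query}\ (x, y)\ \textit{is} \\
\qquad\ \ (tt, \_) \Rightarrow tt \\
\qquad |\ (\_, tt) \Rightarrow tt
\end{array}
\qquad
\text{(b)}\quad \begin{array}{l}
\textit{query} : (\sigma_1 \times \cdots \times \sigma_n) \to \tau \\
\textit{query}\ (x_1^{\sigma_1},\ldots,x_n^{\sigma_n})\ \textit{is} \\
\qquad\ \ p_1 \Rightarrow M_1^\tau \\
\qquad\ \ \vdots \\
\qquad |\ p_k \Rightarrow M_k^\tau
\end{array}$$

Figure 4.1: (a) parallel-or, (b) query syntax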

Example  4.1.1  We  present  some  examples  of  consistent  and  inconsistent  patterns.  We  have 

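$$(\_, 1) \Uparrow (0, \_), \quad (\_, y) \Uparrow (0, 1) \Uparrow (0, y), \quad \text{and also}\quad (x, 1) \Uparrow (0, y).$$

On the other hand,

$$(0, 1) \not\Uparrow (0, 0) \quad\text{and}\quad (\_, 1) \not\Uparrow (0, 0).$$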
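Since pattern elements come from flat domains, consistency reduces to a position-wise check. The Python sketch below is an illustration with hypothetical names: constants are represented by integers or booleans, while variables and the don't-care symbol are represented by strings and both denote the undefined element.

```python
def denote(element):
    """Meaning of a pattern element in a flat domain.

    Strings (variables such as "x", or the don't-care "_") denote the
    undefined element, modelled as None; constants denote themselves.
    """
    return None if isinstance(element, str) else element


def consistent(p, q):
    """Two patterns are consistent if, position by position, the two
    meanings are comparable: one is undefined or they are equal."""
    return all(
        denote(a) is None or denote(b) is None or denote(a) == denote(b)
        for a, b in zip(p, q)
    )
```

The checks of Example 4.1.1 can be reproduced with this function, for example `consistent(("x", 1), (0, "y"))` holds while `consistent((0, 1), (0, 0))` does not.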

A deterministic query has the property that it produces the same output for all consistent
inputs, i.e., given two patterns $p_i, p_j$, if $p_i \Uparrow p_j$ then $\mathcal{D}[\![M_i]\!] = \mathcal{D}[\![M_j]\!]$. The determinism restriction
makes it fairly easy to define a semantics, shown in Figure 4.2. We use Q to refer to the general
form of query, from Figure 4.1(b). Amb is McCarthy's ambiguity operator [62]:

$$\mathit{amb}(\bot, x) = x, \quad \mathit{amb}(x, \bot) = x, \quad \mathit{amb}(x, y) = x \text{ or } y,$$

which behaves essentially like a parallel-or when only one argument is defined, and performs an
arbitrary choice between the arguments if both are defined. $\mathit{Amb}_k$ is k-ary amb. Because of the
determinism constraints, it will always be the case that amb will behave deterministically, i.e., if
we have $\mathit{amb}(x, y)$ with both $x, y$ defined, then $x = y$. Consequently, the meaning of a query will
be a continuous function.

We use the notation $\vec{x}$ for $x_1^{\sigma_1},\ldots,x_n^{\sigma_n}$ and $|$ for concatenating environments, with the simple
properties: $\rho \mid \bot = \bot \mid \rho = \rho$. It is not possible to have multiple bindings for the same variable,
because of our requirement that all variables in one pattern should be distinct. $\wedge$ is parallel-and,
with the properties: $tt \wedge tt = tt$, $\mathit{ff} \wedge \bot = \mathit{ff}$, $\bot \wedge \mathit{ff} = \mathit{ff}$.

The auxiliary semantic function $\mathcal{D}_{pat}$ defines the meaning of a pattern match. It keeps track of
whether the pattern match succeeds and of any bindings generated in the process. A pattern match
succeeds when each element of the pattern matches its corresponding input. Since we can have
variables in the patterns, we may generate bindings. The results of element-wise matching
comparisons are combined with parallel-and, and the environment extended with any newly generated
bindings.

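$$\mathcal{D}_{pat} : \textit{Patterns} \to \textit{Environments} \to (D_{bool} \times \textit{Environments})$$

$$\mathcal{D}[\![Q]\!]\rho = \mathit{amb}_k\left(\mathcal{D}[\![\vec{x}\ \textit{is}\ p_1 \Rightarrow M_1^\tau]\!]\rho,\ \ldots,\ \mathcal{D}[\![\vec{x}\ \textit{is}\ p_k \Rightarrow M_k^\tau]\!]\rho\right)$$

$$\mathcal{D}[\![\vec{x}\ \textit{is}\ p \Rightarrow M^\tau]\!]\rho =
\begin{cases}
\mathcal{D}[\![M^\tau]\!](\rho \mid \mu), & \text{if } \mathcal{D}_{pat}[\![\vec{x}\ \textit{is}\ p]\!]\rho = (tt, \mu) \\
\bot, & \text{if } \mathcal{D}_{pat}[\![\vec{x}\ \textit{is}\ p]\!]\rho = (\mathit{ff}, \mu)
\end{cases}$$

$$\mathcal{D}_{pat}[\![(x_1^{\sigma_1},\ldots,x_n^{\sigma_n})\ \textit{is}\ (e_1^{\sigma_1},\ldots,e_n^{\sigma_n})]\!]\rho = (b_1 \wedge \cdots \wedge b_n,\ \mu_1 \mid \cdots \mid \mu_n),$$
$$\text{where } (b_i, \mu_i) = \mathcal{D}_{pat}[\![x_i^{\sigma_i}\ \textit{is}\ e_i^{\sigma_i}]\!]\rho$$

$$\mathcal{D}_{pat}[\![x^\sigma\ \textit{is}\ e^\sigma]\!]\rho =
\begin{cases}
(\mathcal{D}[\![x^\sigma]\!]\rho = \mathcal{D}[\![e^\sigma]\!]\rho,\ \bot), & \text{if } e^\sigma \text{ is closed} \\
(tt,\ y^\sigma \mapsto x^\sigma), & \text{if } e^\sigma = y^\sigma \\
(tt,\ \bot), & \text{if } e^\sigma = \_
\end{cases}$$

Figure 4.2: Denotational semantics for deterministic query

We work out an example in detail in order to illustrate the semantics. Consider the following
query: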


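$$\begin{array}{l}
Q = \textit{query}\ (x_1, x_2)\ \textit{is} \\
\qquad\ \ (x, 1) \Rightarrow M_1 \\
\qquad |\ (0, y) \Rightarrow M_2.
\end{array}$$

According to the semantics, we have:

$$\mathcal{D}[\![Q]\!]\rho = \mathit{amb}\left(\mathcal{D}[\![x_1 x_2\ \textit{is}\ (x, 1) \Rightarrow M_1]\!]\rho,\ \mathcal{D}[\![x_1 x_2\ \textit{is}\ (0, y) \Rightarrow M_2]\!]\rho\right).$$

The arguments to the ambiguity operator are:

$$\mathcal{D}[\![x_1 x_2\ \textit{is}\ (x, 1) \Rightarrow M_1]\!]\rho =
\begin{cases}
\mathcal{D}[\![M_1]\!](\rho \mid x \mapsto x_1), & \text{if } \rho(x_2) = 1 \\
\bot, & \text{otherwise.}
\end{cases}$$

$$\mathcal{D}[\![x_1 x_2\ \textit{is}\ (0, y) \Rightarrow M_2]\!]\rho =
\begin{cases}
\mathcal{D}[\![M_2]\!](\rho \mid y \mapsto x_2), & \text{if } \rho(x_1) = 0 \\
\bot, & \text{otherwise.}
\end{cases}$$

One can easily see why having variables in the patterns is equivalent to a "don't care"; the only
difference is that the environment $\rho$ gets extended with a new binding. The above equations were
obtained with the aid of the $\mathcal{D}_{pat}$ semantics:

$$\begin{array}{l}
\mathcal{D}_{pat}[\![x_1 x_2\ \textit{is}\ (x, 1)]\!]\rho = (\rho(x_2) = 1,\ x \mapsto x_1), \text{ because} \\
\qquad \mathcal{D}_{pat}[\![x_1\ \textit{is}\ x]\!]\rho = (tt,\ x \mapsto x_1) \\
\qquad \mathcal{D}_{pat}[\![x_2\ \textit{is}\ 1]\!]\rho = (\rho(x_2) = 1,\ \bot) \\[4pt]
\mathcal{D}_{pat}[\![x_1 x_2\ \textit{is}\ (0, y)]\!]\rho = (\rho(x_1) = 0,\ y \mapsto x_2), \text{ because} \\
\qquad \mathcal{D}_{pat}[\![x_1\ \textit{is}\ 0]\!]\rho = (\rho(x_1) = 0,\ \bot) \\
\qquad \mathcal{D}_{pat}[\![x_2\ \textit{is}\ y]\!]\rho = (tt,\ y \mapsto x_2)
\end{array}$$

It should be noted that since the two patterns are consistent (cf. Example 4.1.1), we must have
$\mathcal{D}[\![M_1]\!] = \mathcal{D}[\![M_2]\!]$ if our query is to be deterministic.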
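The semantics of a deterministic query can be sketched as a small evaluator. The Python model below is an approximation under stated assumptions: input values are passed directly (with `None` for an undefined input) instead of being looked up in an environment, variables bind to values rather than to other variables, branch bodies are Python functions of the extended environment, and all names are hypothetical.

```python
def match_element(value, elem):
    """One position of a pattern match: (truth value, new bindings).
    A truth value of None means the comparison is undefined."""
    if elem == "_":
        return True, {}
    if isinstance(elem, str):           # a pattern variable
        return True, {elem: value}
    if value is None:                   # constant against undefined input
        return None, {}
    return value == elem, {}


def pand(a, b):
    """Parallel-and: False wins even if the other argument is undefined."""
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return None


def match_pattern(values, pattern):
    truth, bindings = True, {}
    for value, elem in zip(values, pattern):
        b, mu = match_element(value, elem)
        truth = pand(truth, b)
        bindings.update(mu)
    return truth, bindings


def query(values, branches, env=None):
    """branches: list of (pattern, body). Returns None when undefined."""
    results = []
    for pattern, body in branches:
        truth, mu = match_pattern(values, pattern)
        if truth is True:
            result = body({**(env or {}), **mu})
            if result is not None:
                results.append(result)
    # amb: determinism guarantees that all defined results agree.
    return results[0] if results else None
```

For the query Q above, with both bodies returning the same value, the evaluator answers as soon as either $x_2 = 1$ or $x_1 = 0$ is known.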
Figure 4.3: (a) λx. M, (b) (λx. M)N, (c) ~DLb x y

4.2  Circuit  semantics:  first  approach 

In order to compare por, $pif_o$, and $pif_\iota$, we find it useful to view PCF programs as circuits. There
are several reasons for this. First, it enables us to reason based on the last gate used in the circuit.
Viewing a program as a circuit reduces the number of cases we need to consider. Second, the
running time of the program loosely corresponds to the depth of the circuit. At this stage, we are
only interested in the depth of programs, i.e., closed terms of ground type, so we need not worry
about  complications  caused  by  higher-order  terms.  Also,  a  loose  correspondence  is  fine,  since  we 
only  need  to  distinguish  programs  that  use  recursion  from  programs  which  do  not.  And  third, 
circuits  provide  a  visual  and  intuitive  semantics.  This  is  more  than  a  cosmetic  point:  viewing 
programs  as  circuits  enables  us  to  find  the  connection  with  boolean  circuits  in  the  second  part  of 
this  thesis. 

The  translation  from  PCF  to  circuits  is  simple.  Figure  4.3  shows  circuits  for  function  definition, 
application,  and  a  constant.  A  function  denotes  a  circuit  some  of  whose  inputs  are  labelled  with 
variables.  Application  substitutes  a  value  for  a  variable,  or,  if  we  have  a  whole  circuit,  connects  its 
output to the respective variable-labelled input. Note that higher-order functions can be treated
in this framework as well, by using gates labelled with the function variable inside the circuit (for
an example, see Figure 4.8 in Section 4.5.1). There are gates for the various constants. The only
interesting case is the Y combinator. It gives rise to a special kind of circuit, a dynamic circuit,
which can have subparts expanded dynamically as required during computation.

The  semantics  of  circuits  is  based  on  PCF’s  operational  semantics.  Execution  is  demand-driven 
and  begins  at  the  output.  The  last  gate  in  the  circuit  is  activated.  This  gate  may  start  evaluating 
one  (or  more,  if  it  is  parallel)  of  its  inputs,  leading  to  activity  at  further  gates,  and  so  on.  If  the 
computation  terminates,  the  result  will  filter  down  to  the  output  of  the  last  gate. 

Definition 4.2.1 A circuit is static if it is the translation of a non-recursive PCF program.

Definition 4.2.2 A circuit is dynamic if it is the translation of a recursive PCF program.

A circuit could have several inputs, but it always has just one output, so it is shaped as a tree.

Definition 4.2.3 The depth of a static circuit is equal to the height of the underlying tree.

Definition 4.2.4 A circuit is constant-depth if it is either static, or a dynamic circuit which does
not expand more than a fixed constant number of times (independent of the inputs).
Figure 4.4: $YF$

Example 4.2.5 To give an example of dynamic circuits, and to illustrate the difference between
constant-depth and non-constant-depth dynamic circuits, consider the following PCF term:

\[
F = \lambda f n x.\ \supset_\iota (=\ n\ 3)\ x\ (f\ (+1\ n)\ x).
\]

Figure 4.4 shows the circuit denoted by the recursive PCF term $YF$. We enclose a dynamic circuit
in a box with dotted lines, to represent the fact that it can be expanded. The box is labelled with the
name of the recursive part. The result of expanding the circuit once is shown in Figure 4.5.

The program $YF\ n$ for $0 \le n \le 3$ gives rise to a constant-depth dynamic circuit, since the
recursion stops as soon as $n$ reaches 3, while for $n > 3$ it results in a non-constant-depth dynamic
circuit.
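
The unwinding behaviour of $YF$ can be illustrated by counting how often the recursive call is expanded. The sketch below is an assumption-laden model, not the thesis's circuit semantics: the name `unfold_yf` and the `fuel` bound are hypothetical, and an expansion is counted each time the else-branch calls `f`.

```python
def unfold_yf(n, x, fuel=1000):
    """Simulate YF n x for F = \\f n x. if n == 3 then x else f (n + 1) x.

    Returns (result, expansions); result is None when the fuel bound is hit,
    which stands in for a recursion that never stops unwinding.
    """
    expansions = 0
    while n != 3:
        if expansions == fuel:
            return None, expansions
        n += 1              # the recursive call f (+1 n) x
        expansions += 1
    return x, expansions
```

Starting at or below 3, the number of expansions is bounded by 3; starting above 3, the test never succeeds and only the artificial fuel bound stops the loop.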

In  the  following,  we  are  particularly  interested  in  the  constant-depth  circuits.  If  two  functions 
can  be  implemented  in  terms  of  each  other  with  constant-depth  circuits,  we  say  that  the  two 
functions  are  intensionally  equivalent. 


4.3  Intensional  separation  for  deterministic  extensions 

4.3.1 $pif_\iota$ versus por and $pif_o$

We  begin  by  reviewing  known  implementations  of  the  various  functions. 

Proposition 4.3.1 por and $pif_o$ are intensionally equivalent.

Proof: We need constant-depth implementations of one in terms of the other. This can be done
as follows (cf. [81]):

\[
\begin{array}{rcl}
por & = & \lambda xy.\ pif_o\ x\ tt\ y, \\
pif_o & = & \lambda bxy.\ por\ (pand\ b\ x)\ (pand\ (not\ b)\ y)\ (pand\ x\ y),
\end{array}
\]

where pand is the parallel conjunction defined by:

\[
pand = \lambda xy.\ not\ (por\ (not\ x)\ (not\ y)),
\]

and we have generalized por to three arguments in the obvious way. □
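
The two definitions in the proof can be checked exhaustively over the flat boolean domain. The following sketch assumes that `None` models the undefined value and that `por` and `pif_o` have their usual parallel semantics (written out here as reference implementations); the function names are chosen for illustration only.

```python
BOT = None  # assumption: None stands for the undefined boolean

def por(*args):
    """Parallel or: tt if some argument is tt, ff if all are ff, else undefined."""
    if any(a is True for a in args):
        return True
    if all(a is False for a in args):
        return False
    return BOT

def neg(a):
    """Strict negation."""
    return BOT if a is BOT else (not a)

def pand(x, y):
    """Parallel and, defined from por as in the proof."""
    return neg(por(neg(x), neg(y)))

def pif_o(b, x, y):
    """Reference semantics of the boolean parallel conditional."""
    if b is True:
        return x
    if b is False:
        return y
    return x if (x is not BOT and x == y) else BOT

def por_from_pif(x, y):
    """por = \\x y. pif_o x tt y"""
    return pif_o(x, True, y)

def pif_from_por(b, x, y):
    """pif_o = \\b x y. por (pand b x) (pand (not b) y) (pand x y)"""
    return por(pand(b, x), pand(neg(b), y), pand(x, y))
```

Neither encoding uses recursion, so each translates to a static (hence constant-depth) circuit.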

Figure 4.5: $YF$ expanded once

It is known that $pif_\iota$ can implement $pif_o$ (cf. [81]):

\[
pif_o = \lambda bxy.\ (=\ 1\ (pif_\iota\ b\ (\supset_\iota x\ 1\ 0)\ (\supset_\iota y\ 1\ 0))).
\]

This implementation is also efficient. In view of the previous proposition, it follows that $pif_\iota$ can also
implement por efficiently. However, the converse is false. The problem is that por can only start
parallel subcomputations on booleans, whereas $pif_\iota$ operates in parallel on integers. The standard
way of encoding $pif_\iota$ with por uses recursion (cf. [81]):

\[
\begin{array}{rcl}
pif_\iota & = & Y F\ 0, \text{ where} \\
F & = & \lambda f n b x y.\ \supset_\iota (por\ (pand\ (=\ x\ n)\ (=\ y\ n)) \\
  &   & \qquad\qquad\qquad\quad (pand\ b\ (=\ x\ n)) \\
  &   & \qquad\qquad\qquad\quad (pand\ (not\ b)\ (=\ y\ n))) \\
  &   & \qquad\qquad\quad n \\
  &   & \qquad\qquad\quad (f\ (+1\ n)\ b\ x\ y).
\end{array}
\]

This is clearly inefficient, because of the way the recursion unwinds, checking if x and y are equal
to 0 first, then 1, and so on. But we cannot do any better. To show that, we first prove two lemmas
which restrict the shape of any program computing $pif_\iota$.
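
To make the cost of the recursive encoding concrete, the sketch below simulates it on natural numbers and reports how many times the recursion is unwound. It is an illustration under assumptions: `None` models the undefined value, the loop replaces the Y combinator, and the `fuel` parameter is a hypothetical guard against non-termination.

```python
BOT = None  # assumption: None stands for an undefined input

def por(*args):
    if any(a is True for a in args):
        return True
    if all(a is False for a in args):
        return False
    return BOT

def neg(a):
    return BOT if a is BOT else (not a)

def pand(x, y):
    return neg(por(neg(x), neg(y)))

def eq(a, n):
    """Strict equality test against the counter n."""
    return BOT if a is BOT else (a == n)

def pif_iota_via_por(b, x, y, fuel=10_000):
    """Simulate Y F 0 b x y; returns (result, number of unwindings)."""
    for n in range(fuel):
        test = por(pand(eq(x, n), eq(y, n)),
                   pand(b, eq(x, n)),
                   pand(neg(b), eq(y, n)))
        if test is True:
            return n, n          # the answer n is reached after n unwindings
        if test is BOT:
            return BOT, n        # the guard is undefined, so the result is too
    return BOT, fuel
```

The number of unwindings equals the value returned, so no fixed bound on the depth works for all inputs; this is the behaviour the following lemmas show to be unavoidable.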

The point of the first lemma is that it is impossible to design boolean circuitry B which chooses
between x and y and obeys all the requirements of $pif_\iota$.

Lemma 4.3.2 It is not possible to write a program in PCF + por that computes $pif_\iota$ b x y and is
of the form $\supset_\iota$ B x y, where B is a static circuit yielding a boolean.

Proof: Without loss of generality, the issue is whether it is possible to write a PCF + por function
B with the following properties:

1. If b is tt then B is tt,

2. If b is ff then B is ff,

3. If (= x y) is tt then B is tt.

    b        (= x y)    B
    tt       $\bot$     tt
    ff       $\bot$     ff
    $\bot$   tt         tt

Table 4.1: Requirements for function B

Table 4.1 shows some of the inputs and corresponding outputs for function B. For simplicity, we
assume only b and (= x y) are used in evaluating B. The same argument can be carried through
with additional inputs, since b and (= x y) must be used in evaluating B.

The last line of Table 4.1 implies by monotonicity that B ff tt = tt. But the second line implies,
again by monotonicity, that B ff tt = ff, a contradiction. Therefore, no program of this form
computes $pif_\iota$ b x y. □
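
The monotonicity argument can also be confirmed by brute force: enumerate every function from pairs of flat booleans to flat booleans and keep those that are monotone and agree with the three rows of Table 4.1. The sketch assumes `None` models the undefined value; the helper names are illustrative.

```python
from itertools import product

BOT = None                      # assumption: None stands for the undefined value
DOM = (BOT, True, False)        # the flat boolean domain
PAIRS = list(product(DOM, repeat=2))

def leq(a, b):
    """Information ordering on the flat domain: bottom is below everything."""
    return a is BOT or a == b

def monotone(f):
    """f is monotone if larger inputs never give incomparable or smaller outputs."""
    return all(leq(f[p], f[q])
               for p in PAIRS for q in PAIRS
               if leq(p[0], q[0]) and leq(p[1], q[1]))

def meets_table(f):
    """The three rows of Table 4.1, with arguments (b, (= x y))."""
    return (f[(True, BOT)] is True
            and f[(False, BOT)] is False
            and f[(BOT, True)] is True)

candidates = (dict(zip(PAIRS, outs)) for outs in product(DOM, repeat=len(PAIRS)))
solutions = [f for f in candidates if meets_table(f) and monotone(f)]
```

The list `solutions` comes out empty, since rows two and three force incompatible values at the input (ff, tt).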

Our  second  lemma  generalizes  the  first  one. 

Lemma 4.3.3 It is not possible to write a program in PCF + por that computes $pif_\iota$ b x y and is
of the form $\supset_\iota$ B $N_1$ $N_2$, where B, $N_1$, $N_2$ are static circuits yielding a boolean and two integers
respectively.

Proof: Intuitively, there are two possibilities for B: either it “chooses” between $N_1$ and $N_2$, or
it is “hardwired” to always pick one of them. More precisely, we have two cases for the function
computed by B:

1. B is non-constant. Since the program computes $pif_\iota$ b x y, the result must be either x or y.
   There are an infinite number of possible inputs and outputs and $N_1$, $N_2$ are static circuits,
   so it is not possible to hard-code the output. B will sometimes return tt and sometimes ff.
   There are then three choices for what $N_1$, $N_2$ evaluate to:

   (a) They evaluate to x, y, respectively. But this is impossible by Lemma 4.3.2.

   (b) They both evaluate to $pif_\iota$ b x y. The $\supset_\iota$ gate then does no work. Since $N_1$, $N_2$ both
       compute something of type integer, there are essentially two cases for the last gate used
       in their construction: (i) $\supset_\iota$ or (ii) +1 (−1 is handled similarly). In case (i) apply the
       same reasoning of this lemma. There cannot be an infinite sequence of $\supset_\iota$ gates which
       do nothing, since the circuit is static. It is not possible for all $\supset_\iota$ gates to do nothing
       since the output would then have to be constructed out of +1, −1, and the integers,
       so it would either be hard-coded (and it must work for an infinite number of values),
       or produce a fixed offset from x or y. The latter case is analogous to case (1a) above,
       except that the branches evaluate here to a fixed offset of x or y; the same reasoning
       applies. In case (ii) there cannot only be +1 (or −1) gates for the reason outlined above.
       Also, there can only be a constant number of +1 (or −1) in a row before some $\supset_\iota$ is
       reached, whereupon we can apply the lemma again. By the same reasoning we must at
       some point encounter case (1a) of the proof.

   (c) One evaluates to $pif_\iota$ b x y and the other to x or y. We apply the same reasoning as
       in case (1b) to the last gate in the branch evaluating to $pif_\iota$ b x y, eventually reaching
       case (1a).

2. B is constant. That means that either $N_1$ or $N_2$ must compute $pif_\iota$ b x y. Again we have
   a $\supset_\iota$ gate which does no work. Without loss of generality, assume B is tt, so $N_1$ always
   gets chosen. We apply the same reasoning as in case (1b) to the last gate in $N_1$, eventually
   reducing the problem to case (1a).

So our circuit cannot be filled with gates which “do no work.” At some point there must be a $\supset_\iota$
which essentially attempts to choose between x and y. But that is impossible by Lemma 4.3.2.
Therefore, our $pif_\iota$ program cannot have even this more general form. □

Now  we  are  ready  to  prove  the  main  result  of  this  section. 

Proposition 4.3.4 PCF + por cannot implement $pif_\iota$ with a constant-depth circuit.

Proof: Assume there exists a constant-depth circuit computing $pif_\iota$. There are two possibilities:

1. Static circuit. The result has type integer. Therefore, there are two cases for the last gate in
   the circuit:

   (a) $\supset_\iota$. By Lemma 4.3.3 this is not possible.

   (b) +1 or −1. The circuit cannot be constructed entirely out of +1, −1, integers, x, y,
       because the result would be either hard-coded (and it must work for an infinite number
       of values), or a fixed offset of x or y. Also, since the circuit is static, there can only be
       a constant number of +1 or −1 in a row before reaching an occurrence of $\supset_\iota$. Then we
       have essentially the same situation as in case (1a) (modulo some fixed offset, as in the
       proof of Lemma 4.3.3), and by the same argument the circuit cannot implement $pif_\iota$.

2. Dynamic circuit. We want to show that the circuit cannot be constant-depth. Assume, for a
   contradiction, that there is a fixed maximum constant depth beyond which the recursion does
   not get unwound, regardless of the inputs b, x, y. Then there are only finitely many
   constant-depth circuits which could be the result of the unwinding. But there are infinitely many
   possible inputs. Therefore, at least one of these circuits must work for infinitely many inputs.
   Apply the same reasoning on that circuit as in case (1) of this proof. We can assume there is
   no other recursion, otherwise continue the argument on the innermost recursion, which must
   exist because of the constant-depth assumption. Therefore, there is no fixed maximum depth
   for unwinding the recursion computing $pif_\iota$.

In conclusion, it is impossible to write a constant-depth program using por to compute $pif_\iota$,
therefore por and $pif_\iota$ are not intensionally equivalent. □

4.3.2 Query versus $pif_\iota$

In order to compare query to the other constructs, we need to make finer-grained distinctions than
those in the previous section, and consequently, circuit semantics is no longer useful in the form we
have presented it. The problem is due to the fact that we have a mixture of sequential and parallel
constructs in the language and circuit semantics is an inherently parallel semantics. For a
non-parallel evaluation strategy this implies that the running time of a program does not correspond
closely to its circuit depth. To make the comparisons in this section we need to extend PCF’s
operational semantics with a notion of running time. We omit the details, but note that the results
of this section only apply to evaluation strategies that are not parallel on the sequential constructs
of PCF. We return to this point in the last chapter of the thesis.

To see that query is more powerful, consider the implementation of an n-ary function, such as
n-ary addition. Assume the existence of an addition operation (+), so we can write sequential
addition without having to use recursion: $add_2 = \lambda xy.\ x + y$. We can implement binary addition
with query as follows:

\[
\begin{array}{rcl}
padd & = & \lambda x_1 x_2.\ \mathbf{query}\ (x_1, x_2)\ \mathbf{is} \\
     &   & \quad (v_1, v_2) \Rightarrow v_1 + v_2
\end{array}
\]

Note that the addition of $v_1$ and $v_2$ is performed sequentially (this + is sequential, not
bitwise-parallel). This is not essential. What is important is that separate processes are started to evaluate
the inputs. Thus, we can implement n-ary addition in depth log n by constructing a tree of binary
additions.
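
The depth claim for a tree of binary additions can be checked with a short sketch. It assumes a balanced split of the inputs and counts one level per binary addition; the function name `padd_tree` is hypothetical and the sketch measures the shape of the tree only, not actual parallel execution.

```python
def padd_tree(xs):
    """Sum a non-empty list with a balanced tree of binary additions.

    Returns (sum, depth), where depth is the number of addition levels
    on the longest path from an input to the result.
    """
    if len(xs) == 1:
        return xs[0], 0
    mid = len(xs) // 2
    left_sum, left_depth = padd_tree(xs[:mid])
    right_sum, right_depth = padd_tree(xs[mid:])
    # Both halves are independent, so they can be evaluated in parallel;
    # the binary addition on top adds one level.
    return left_sum + right_sum, 1 + max(left_depth, right_depth)
```

For n inputs the depth is the ceiling of log2 n, whereas chaining the same additions one after another would give depth n − 1.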

So the question we are concerned with is whether $pif_\iota$ can also implement n-ary addition
efficiently. The answer is no. The problem is that even though $pif_\iota$ can start parallel subcomputations
to evaluate two integers, it must return one of them. There is no way to combine the results of the
subcomputations. Only a limited amount of communication exists between the subcomputations:
a check for equality of their results.

Proposition 4.3.5 PCF + $pif_\iota$ cannot implement n-ary addition in depth log n.

Proof: We identify a property that holds for our query program, padd, and show that it does not
hold for programs of PCF + $pif_\iota$. In padd the inputs are evaluated in parallel and the result is
their sum. In PCF + $pif_\iota$, the only parallel primitive is $pif_\iota$, so the inputs x and y must go through
some $pif_\iota$ if they are to be evaluated in parallel. Suppose x goes through $pif_\iota$ after passing through
some constant-depth circuit computing F, and similarly for y and a function G. Then the output
of the $pif_\iota$ is either F(x) or G(y). If either F(x) = x + y or G(y) = x + y, then the addition was
performed sequentially before the $pif_\iota$. If the output of $pif_\iota$ goes into some constant-depth H such
that H(F(x)) = x + y or H(G(y)) = x + y, then the addition was also performed sequentially, this
time after the $pif_\iota$. So it is not possible to compute x + y using $pif_\iota$ in such a way that x and y
are evaluated in parallel. Therefore, a PCF + $pif_\iota$ program for n-ary addition must be of depth at
least n. □

As  a  corollary  of  the  previous  two  propositions,  we  have  the  following: 

Proposition  4.3.6  PCF  +  por  cannot  implement  n-ary  addition  in  depth  log  n. 

In light of these results, a picture emerges of different levels of intensional
expressiveness for deterministic parallel constructs: At the lowest level we have por and $pif_o$, which
seem to be able to speed up only n-ary boolean functions. At the next level we have $pif_\iota$, which
can be used to speed up some integer functions. Finally, at the top level we have query, which can
be used to speed up n-ary addition.

4.4  Comparing  deterministic  and  nondeterministic  query 

A natural question to ask, after the results of the previous section, is whether relaxing the
determinism constraint on the deterministic query gives us an increase in intensional expressiveness. This
turns out to be a difficult question. In order to answer it we are forced to make some concessions:
First, we consider a recursion-free version of PCF, since the recursive part does not mesh well with
our interpretation for nondeterministic query. This is not that important, however, since we shall
be making a connection with boolean circuits, which are not recursive. Second, we make a
hardware assumption, in order to render the result of a subset of nondeterministic queries deterministic.
This, of course, implies that we are not really comparing a deterministic and a nondeterministic
construct, but rather two different machine models. We also need to go back and revisit the circuit
semantics in order to obtain a precise correspondence between the dimensions of a circuit and the
running time and work required to execute a program. We begin by introducing the modified
language and nondeterministic query.

4.4.1 Recursion-free PCF

The big departure from the previous description of PCF is the lack of recursion. However, since
we still need to talk about undefined (or “missing”) inputs, we introduce “undefined” constants
$\Omega_\sigma$, one for each type $\sigma$. We also expand somewhat the set of arithmetic constants, but this is not
essential. The new set of constants we will consider is:

\[
\begin{array}{ll}
tt, ff : o & +, - : \iota \to \iota \to \iota \\
n : \iota & \supset_\sigma : o \to \sigma \to \sigma \to \sigma \quad (\sigma \in \{o, \iota\}) \\
=, <, >, \ge\ : \iota \to \iota \to o & \Omega_\sigma : \sigma \quad \text{(undefined elements)}
\end{array}
\]

4.4.2  Nondeterministic  query 

We drop the consistency requirement on the various outputs of a query. As an example of the
programs we can write now, here is one that turns out to be important in what follows:

\[
\begin{array}{rcl}
not_\bot & = & \lambda x.\ \mathbf{query}\ (x)\ \mathbf{is} \\
         &   & \quad tt \Rightarrow ff \\
         &   & \mid\ ff \Rightarrow ff \\
         &   & \mid\ \_ \Rightarrow tt
\end{array}
\]

The semantics we presented earlier for deterministic query (Figure 4.2) also makes sense for
nondeterministic query, but now permits the amb operator to be presented with distinct defined inputs.
Under this interpretation, however, programs no longer compute functions, but relations. Given
the fact that our definition of intensional expressiveness requires the computation of functions, we
would have to restrict the indeterminacy to the inside of a program. The following proposition
shows that we cannot do that in any useful way.

Proposition 4.4.1 Under the semantics of Figure 4.2, nondeterministic query is not intensionally
more expressive than deterministic query.

Proof: Let P be a nondeterministic program which computes the function f, and suppose it
uses amb(x, y), with $x \neq y$. Since P must compute a deterministic answer (by the definition of
intensional expressiveness), the amb(x, y) must be “determinized” somehow. Essentially, the only
way that could be achieved is to throw it out: either not use it, or use it in one branch of a
conditional that always chooses the other branch, or use it as the argument to a function whose
result is independent of its input. But then we can certainly write a deterministic version of P that
computes f with the same efficiency. □
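\[
\mathcal{D}_{pat}[\![x^\sigma \text{ is } \__{ij}]\!]\rho =
\begin{cases}
(not\ (poll\ x^\sigma),\ \bot) & \text{if } exhausted(x^\sigma, i, j, p_1, \ldots, p_k, \rho) \\
(tt,\ \bot) & \text{otherwise}
\end{cases}
\]

\[
exhausted(x^o, i, j, p_1, \ldots, p_k, \rho) =
\begin{cases}
tt & \text{if } ((\exists l, m.\ \mathcal{D}[\![e_{lj}]\!]\rho = tt \wedge \mathcal{D}[\![e_{mj}]\!]\rho = ff) \vee (\exists l.\ e_{lj} = y^o)) \wedge (p_1\backslash j \Uparrow \cdots \Uparrow p_k\backslash j) \\
ff & \text{otherwise}
\end{cases}
\]

\[
exhausted(x^\iota, i, j, p_1, \ldots, p_k, \rho) =
\begin{cases}
tt & \text{if } (\exists l.\ e_{lj} = y^\iota) \wedge (p_1\backslash j \Uparrow \cdots \Uparrow p_k\backslash j) \\
ff & \text{otherwise}
\end{cases}
\]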

Figure  4.6:  Semantics  for  nondeterministic  query 


We still want to restrict ourselves to programs that compute functions, even though they may use
nondeterministic query, but we need a different interpretation for the meaning of nondeterministic
query. One possibility is shown in Figure 4.6. The idea is to allow the “don’t care” symbol
to represent $\bot$ in certain circumstances, e.g., when the corresponding pattern position has been
exhaustively checked for all other alternatives, and the remainder of the pattern is consistent. The
semantics is the same as in Figure 4.2 except for the “don’t care” symbol. Let us say that such a
symbol is found at location j in the ith pattern (written $\__{ij}$). If the other patterns exhaustively check
the input at location j and are otherwise consistent ($p\backslash j$ refers to the pattern p without location j),
then the meaning of matching $x^\sigma$ to $\__{ij}$ is a poll of the input. Poll is a nondeterministic construct
[68] which checks whether an input is available:

\[
poll\ \bot = ff, \qquad poll\ x = tt \ \text{ if } x \neq \bot.
\]

The $p_1$ through $p_k$ are the k patterns from the general form of a query (cf. Figure 4.1(b)). The
notation $e_{lj}$ refers to the element at location j in the lth pattern. The function exhausted has two
cases: if the input is of type boolean, then it looks for both a tt and an ff element in the corresponding
position in the other patterns. If the input is an integer, it looks for a variable in the corresponding
position (since one cannot exhaust all other integers by enumeration in the other patterns). In
both the boolean and the integer case, the remainder of the pattern is checked for consistency.

Example 4.4.2 We present examples of interpreting “don’t care” as $\bot$ and others where we do
not. In the following:

    tt        (tt, 1)        (1, ff)
    ff        (ff, 1)        (x, ff)
    _         (_, 1)         (_, ff),

the “don’t care” is interpreted as $\bot$. However, in the following two examples it is not:

    (tt, 1)        (1, ff)
    (ff, 0)        (2, ff)
    (_, 1)         (_, ff).

Now we can identify a subset of the nondeterministic queries which return deterministic results,
assuming we can detect undefined inputs. When we have a “$\__{ij}$” interpreted as $\bot$ we can take its
meaning to be $\mathcal{D}[\![\__{ij}]\!]\rho = \star$, where we have added the element $\star$ to the flat domains
with $\bot \sqsubseteq \star$. Then under the above definition of consistency we obtain the desired queries.

An example of a query which returns deterministic answers, assuming we can detect undefined
inputs, is the $not_\bot$ program presented earlier. Because $\mathcal{D}[\![\_]\!]\rho = \star$, the three patterns are not
consistent, and so it is fine for the result of the last pattern to be different. If we extend the $not_\bot$
query with another line, such as $\_ \Rightarrow ff$, then the query will no longer return deterministic answers,
since both symbols will be interpreted as $\bot$.
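
The behaviour of $not_\bot$ and of its extension with a second “don’t care” line can be illustrated with a deliberately simplified sketch. It assumes single-argument boolean queries only (so the “remainder of the pattern” is empty and trivially consistent), models the undefined input as `None`, and collects the set of possible results instead of choosing one; all names are illustrative.

```python
BOT = None  # assumption: None stands for an undefined input

def poll(x):
    """poll: is the input available?"""
    return x is not BOT

def exhausted(clauses, i):
    """Boolean case: the other patterns contain both tt and ff."""
    others = [pattern for k, (pattern, _) in enumerate(clauses) if k != i]
    return any(p is True for p in others) and any(p is False for p in others)

def run_query(clauses, x):
    """Return the set of results of all clauses whose pattern matches x."""
    results = set()
    for i, (pattern, result) in enumerate(clauses):
        if pattern == "_":
            # An exhausted "don't care" matches exactly the undefined input.
            matched = (not poll(x)) if exhausted(clauses, i) else True
        else:
            matched = poll(x) and x == pattern
        if matched:
            results.add(result)
    return results

# not_bot = \x. query (x) is tt => ff | ff => ff | _ => tt
not_bot = [(True, False), (False, False), ("_", True)]
# The same query extended with the extra line _ => ff
not_bot_extended = not_bot + [("_", False)]
```

With `not_bot` every input yields a singleton result set, while `not_bot_extended` yields two different results on the undefined input because both “don’t care” lines match it.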

We call the extension of PCF with deterministic query DPCF, and the extension with
nondeterministic query NPCF. It is quite obvious that NPCF is extensionally more expressive than
DPCF,  but  we  are  interested  in  the  following  question:  Is  NPCF  intensionally  more  expressive  than 
DPCF?  Since  NPCF  is  an  extension  of  DPCF  it  can  certainly  compute  as  efficiently  as  DPCF. 
However,  to  show  that  NPCF  is  more  expressive,  we  must  exhibit  a  function  expressible  in  both 
DPCF  and  NPCF,  and  prove  that  DPCF  cannot  compute  it  as  efficiently.  This  question  turns  out 
to  be  analogous  to  a  problem  in  computational  complexity  theory,  that  of  comparing  monotone 
and  De  Morgan  boolean  circuits.  Before  showing  why,  we  return  to  our  circuit  semantics. 

4.5  Circuit  semantics  revisited 

As  we  have  seen,  the  basic  idea  of  circuit  semantics  is  very  simple  and  has  much  in  common  with 
dataflow  networks:  view  each  construct  of  PCF  as  a  gate,  and  view  a  computation  as  data  flowing 
through  the  gate.  The  whole  program  becomes  a  circuit.  Earlier  we  considered  the  circuits  as  being 
executed  bottom-up.  This  reflected  our  intuition  about  recursion  being  unwound  on  demand.  But 
now  we  wish  to  view  them  as  being  executed  top-down,  given  our  upcoming  comparison  with 
boolean  circuits.  We  shall  also  make  very  precise  the  relationship  between  the  dimensions  of  a 
circuit  and  various  parallel  evaluation  orders  for  PCF,  and  introduce  new  gates  to  model  queries. 

4.5.1 Circuits for PCF

Figure 4.7(a) and (b) show the circuits representing constants and variables. The truth values and
the integers are represented by nodes with no inputs. Nodes may have several outputs (fan-out
is unlimited). Each of the functional constants is a node with the required number of inputs. A
variable is represented by a wire, or for higher-order functions, a placeholder circuit labelled with
the variable name. Figure 4.7(c) shows an input that is ignored; we need such a convention to
write functions like the K combinator. The representation of conditionals in Figure 4.7(a) shows
one of the essential differences with dataflow networks, which use switch and merge nodes to avoid
evaluating more than one branch of the conditional during data-driven (top-down) execution. For
demand-driven (bottom-up) execution of the network the difference is irrelevant. Figure 4.7(d) and (e)
show the representation of lambda abstraction and application. This points out the other major
difference with dataflow networks: there are no application nodes. Figure 4.8 shows the circuits
representing three simple PCF terms. Note that our representation builds in sharing of arguments
of ground type (but functional arguments are not shared).

Since it is rather difficult to reason about circuit dimensions in pictorial form, we define a
model for the dimensions of a circuit. Figure 4.9 is an extended semantics for PCF, returning
step-counting versions of a term, S-Terms (cf. t-programs in [77]), computing its depth and size.
Evaluating the step-counting depth and size translations of a program gives us its depth and size.
We assume renaming of bound variables, so that all identifiers are unique. The $\eta$ environment in the
size translation ensures sharing of arguments of ground type. Whenever a variable x of ground type
is encountered, the environment $\eta$ is checked. If the variable does not occur in the environment,
i.e., $\eta(x) = \bot$, then $\eta$ is extended with the binding $x \mapsto 1$. On subsequent encounters of the same
variable, its size will not be counted. Recall that we assume renaming of bound variables, in order
to avoid any clashes in $\eta$. $\pi_1$ is the first projection. The syntax of S-Terms uses $\Lambda$ rather than $\lambda$
merely to emphasize the distinction between the programming language and the meta-language.

Figure 4.7: Circuits for (a) constants, (b) variables, (c) ignored inputs, (d) abstraction, (e) application,
(f) query

Figure 4.8: Examples: (a) $(\lambda x.\ x + x)2$, (b) $\lambda xy.\ x$, (c) $\lambda fx.\ f(fx)$

Example 4.5.1 Consider the program $(\lambda x.\ x + x)2$, whose circuit semantics is depicted in
Figure 4.8(a). The meaning of the program under the extended semantics is:

\[
\mathcal{E}[\![(\lambda x.\ x + x)2]\!]\bot = (4,\ (\Lambda x.\ 1 + \max(x, x))1,\ (\Lambda x.\ 1 + x + 0)1).
\]

Obtaining the S-term for the depth is straightforward. We describe in more detail how the size
S-term is calculated:

\[
\begin{array}{l}
size((\lambda x.\ x + x)2)\bot = (s_1 s_2,\ \eta''), \text{ where} \\
\quad (s_1, \eta') = size(\lambda x.\ x + x)\bot = (\Lambda x.\ s_1',\ \eta'), \\
\quad (s_2, \eta'') = size(2)\eta'.
\end{array}
\]

It is in the evaluation of $(s_1', \eta') = size(x + x)\bot$ that we use the environment:

\[
\begin{array}{l}
(s_1', \eta') = size(x + x)\bot = (1 + s_{11} + s_{12},\ \eta_{12}), \text{ where} \\
\quad (s_{11}, \eta_{11}) = size(x)\bot = (x,\ x \mapsto 1), \\
\quad (s_{12}, \eta_{12}) = size(x)(x \mapsto 1) = (0,\ x \mapsto 1).
\end{array}
\]

From the above we deduce that:

\[
\begin{array}{l}
size(\lambda x.\ x + x)\bot = (\Lambda x.\ 1 + x + 0,\ x \mapsto 1), \text{ and} \\
size(2)(x \mapsto 1) = (1,\ x \mapsto 1).
\end{array}
\]

Evaluating the depth and size step-counting programs, we get a depth and size of 2, which match
the circuit dimensions in Figure 4.8(a).
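
The depth and size translations of Figure 4.9 can be prototyped for a small fragment of PCF. The sketch below is a simplification under stated assumptions: terms are encoded as hypothetical tagged tuples, only constants, addition, ground-type variables, abstraction and application are covered, S-terms are represented as Python closures over a valuation, and the environment of the size translation is a set of already-counted variables.

```python
def depth(term):
    """Depth translation: returns an S-term as a function of a valuation."""
    tag = term[0]
    if tag == "const":
        return lambda env: 1
    if tag == "var":
        name = term[1]
        return lambda env: env[name]
    if tag == "add":
        left, right = depth(term[1]), depth(term[2])
        return lambda env: 1 + max(left(env), right(env))
    if tag == "lam":
        name, body = term[1], depth(term[2])
        return lambda env: (lambda arg: body({**env, name: arg}))
    if tag == "app":
        fun, arg = depth(term[1]), depth(term[2])
        return lambda env: fun(env)(arg(env))
    raise ValueError(tag)

def size(term, seen):
    """Size translation: returns (S-term, updated set of counted variables)."""
    tag = term[0]
    if tag == "const":
        return (lambda env: 1), seen
    if tag == "var":                      # ground-type variables only
        name = term[1]
        if name in seen:                  # shared argument: count it once
            return (lambda env: 0), seen
        return (lambda env: env[name]), seen | {name}
    if tag == "add":
        left, seen1 = size(term[1], seen)
        right, seen2 = size(term[2], seen1)
        return (lambda env: 1 + left(env) + right(env)), seen2
    if tag == "lam":
        name = term[1]
        body, seen1 = size(term[2], seen)
        return (lambda env: (lambda arg: body({**env, name: arg}))), seen1
    if tag == "app":
        fun, seen1 = size(term[1], seen)
        arg, seen2 = size(term[2], seen1)
        return (lambda env: fun(env)(arg(env))), seen2
    raise ValueError(tag)

# (\x. x + x) 2
double_two = ("app", ("lam", "x", ("add", ("var", "x"), ("var", "x"))), ("const", 2))
program_depth = depth(double_two)({})
program_size = size(double_two, frozenset())[0]({})
```

For the program of Example 4.5.1 both `program_depth` and `program_size` evaluate to 2, in agreement with the calculation above; the second occurrence of x contributes 0 to the size because the argument is shared.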

We  now  prove  that  the  circuit  dimensions  are  indeed  matched  by  the  extended  semantics. 

Proposition 4.5.2 For a PCF program M, $\mathcal{E}[\![M]\!]\bot = (v, d, s)$ if and only if the circuit representing
M has depth d and size s.

Proof: By induction on the structure of M. We need an induction hypothesis that works at higher
types (cf. [70] for a similar example). We define predicates $Match^\sigma$ by induction on types:

1. If $M^\sigma$ is a program, then $M^\sigma$ has property $Match^\sigma$ if $\mathcal{E}[\![M^\sigma]\!]\bot = (v, d, s)$ if and only if the
   circuit representing M has depth d and size s.

2. If $M^{\sigma \to \tau}$ is a closed term, then it has property $Match^{\sigma \to \tau}$ if whenever $N^\sigma$ is a closed term
   with property $Match^\sigma$, $M^{\sigma \to \tau} N^\sigma$ has property $Match^\tau$.

3. If $M^\sigma$ is an open term with free variables $x_1^{\sigma_1}, \ldots, x_n^{\sigma_n}$, then it has property $Match^\sigma$ if the term
   $[N_1/x_1] \cdots [N_n/x_n]M^\sigma$ has property $Match^\sigma$ whenever $N_1, \ldots, N_n$ are closed terms having
   properties $Match^{\sigma_1}, \ldots, Match^{\sigma_n}$, respectively.
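\[
\begin{array}{l}
\mathcal{E} : \mathit{Terms} \to \mathit{Environments} \to (\bigcup_\sigma D_\sigma \times \text{S-Terms} \times \text{S-Terms}) \\
\mathcal{E}[\![M]\!]\rho = (\mathcal{D}[\![M]\!]\rho,\ depth(M),\ \pi_1(size(M)\bot))
\end{array}
\]

\[
\begin{array}{l}
depth : \mathit{Terms} \to \text{S-Terms} \\
depth(tt) = 1 \\
depth(n) = 1 \\
depth(M_1 = M_2) = 1 + \max(depth(M_1), depth(M_2)) \\
depth(M_1 + M_2) = 1 + \max(depth(M_1), depth(M_2)) \\
depth(\supset_\iota M_1\ M_2\ M_3) = 1 + \max(depth(M_1), depth(M_2), depth(M_3)) \\
depth(x^\sigma) = x^\sigma \\
depth(M_1\ M_2) = depth(M_1)\ depth(M_2) \\
depth(\lambda x^\sigma.\ M) = \Lambda x^\sigma.\ depth(M)
\end{array}
\]

\[
\begin{array}{l}
size : \mathit{Terms} \to \mathit{Environments} \to (\text{S-Terms} \times \mathit{Environments}) \\
size(tt)\eta = (1, \eta) \\
size(n)\eta = (1, \eta) \\
size(M_1 = M_2)\eta = (1 + s_1 + s_2,\ \eta''), \text{ where } (s_1, \eta') = size(M_1)\eta,\ (s_2, \eta'') = size(M_2)\eta' \\
size(M_1 + M_2)\eta = (1 + s_1 + s_2,\ \eta''), \text{ where } (s_1, \eta') = size(M_1)\eta,\ (s_2, \eta'') = size(M_2)\eta' \\
size(\supset_\iota M_1\ M_2\ M_3)\eta = (1 + s_1 + s_2 + s_3,\ \eta'''), \text{ where} \\
\quad (s_1, \eta') = size(M_1)\eta,\ (s_2, \eta'') = size(M_2)\eta',\ (s_3, \eta''') = size(M_3)\eta'' \\
size(x^\sigma)\eta =
\begin{cases}
(x^\sigma,\ \eta[x^\sigma \mapsto 1]) & \text{if } \sigma \in \{o, \iota\} \text{ and } \eta(x^\sigma) = \bot \\
(0,\ \eta) & \text{if } \sigma \in \{o, \iota\} \text{ and } \eta(x^\sigma) = 1 \\
(x^\sigma,\ \eta) & \text{otherwise}
\end{cases} \\
size(M_1\ M_2)\eta = (s_1 s_2,\ \eta''), \text{ where } (s_1, \eta') = size(M_1)\eta,\ (s_2, \eta'') = size(M_2)\eta' \\
size(\lambda x^\sigma.\ M)\eta = (\Lambda x^\sigma.\ s_1,\ \eta'), \text{ where } (s_1, \eta') = size(M)\eta
\end{array}
\]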


Figure  4.9:  Extended  semantics  for  PCF 
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We  call  a  term  M°  matching  if  it  has  property  Match'7.  We  need  to  prove  that  all  terms  are 
matching.  This  is  obvious  for  boolean  and  integer  constants.  We  only  consider  one  functional 
constant,  as  the  proof  is  similar  for  the  others. 

1. $M = M_1 + M_2$. By the induction hypothesis, $M_1, M_2$ have property $\mathit{Match}^\iota$. We have several
cases depending on whether the terms $M_1, M_2$ are closed or open:

(a) $M_1, M_2$ are both closed. Then by case (1) of the definition of $\mathit{Match}^\iota$, $\mathcal{E}[\![M_1]\!]\bot = (v_1, d_1, s_1)$
and $\mathcal{E}[\![M_2]\!]\bot = (v_2, d_2, s_2)$ iff the circuits representing $M_1, M_2$ have dimensions
$d_1, s_1$ and $d_2, s_2$, respectively. Since both terms are closed there is no possible sharing, so
by the definition of $\mathcal{E}$ and of the circuits, the depth and size of $M_1 + M_2$ are $1 + \max(d_1, d_2)$
and $1 + s_1 + s_2$, respectively.

(b) $M_1$ is open, $M_2$ is closed. By case (3) of the definition of $\mathit{Match}^\iota$, any closed instantiation
of $M_1$ with matching terms will have property $\mathit{Match}^\iota$. Since $M_2$ is closed, there is no
sharing between $M_1$ and $M_2$, so, as before, the dimensions match.

(c) $M_1, M_2$ are both open. Consider the set $S = FV(M_1) \cap FV(M_2)$, restricted to variables
of ground type. By case (3) of the induction hypothesis, any closed instantiation of
$M_1, M_2$ with matching terms will have property $\mathit{Match}^\iota$. Then the depth of $M_1 + M_2$
will be the same in the two models, since sharing is not relevant. The size of $M_1 + M_2$
in the circuit model will only count each instantiation of a shared variable $x^\sigma \in S$ once.
But that is exactly what the $\eta$ environment is for in the $\mathcal{E}$ model. Therefore, the size
will also match.

2. $M = x^\sigma$. Any closed instantiation of $x^\sigma$ by a term satisfying $\mathit{Match}^\sigma$ will have the same
property.

3. $M = M_1^{\sigma \to \tau} M_2^\sigma$. By the induction hypothesis, $M_1^{\sigma \to \tau}$ satisfies property $\mathit{Match}^{\sigma \to \tau}$ and $M_2^\sigma$
has property $\mathit{Match}^\sigma$. If both $M_1, M_2$ are closed, then by case (2) of the induction hypothesis,
$M_1^{\sigma \to \tau} M_2^\sigma$ has property $\mathit{Match}^\tau$. If $M_1, M_2$ are open, then construct the set $S$ as above. Any
closed instantiation of $M_1, M_2$ with matching terms will result in the sharing of $x \in S$, both
in circuits and in the $\mathcal{E}$ model. Therefore $M_1^{\sigma \to \tau} M_2^\sigma$ will have property $\mathit{Match}^\tau$.

4. $M = \lambda x^\sigma.\ M^\tau$. By the induction hypothesis, $M^\tau$ has property $\mathit{Match}^\tau$. We need to show
that $\lambda x^\sigma.\ M^\tau$ has property $\mathit{Match}^{\sigma \to \tau}$. Suppose that $\lambda x^\sigma.\ M^\tau$ is closed. If $\sigma \in \{o, \iota\}$, then
for any input $N^\sigma$ both the circuits and the $\mathcal{E}$ model will only include one copy of $N^\sigma$. The
depth function from the $\mathcal{E}$ model actually does no sharing, but this is irrelevant, as the depth
is unaffected. Then $(\lambda x^\sigma.\ M^\tau)N^\sigma$ will have property $\mathit{Match}^\tau$. For general $\sigma$, there are no
sharing issues, so the result again has property $\mathit{Match}^\tau$. Now suppose $\lambda x^\sigma.\ M^\tau$ is open.
Any closed instantiation of $\lambda x^\sigma.\ M^\tau$ with matching terms will result in the same sharing
of common variables in both circuits and the $\mathcal{E}$ model. Therefore, $\lambda x^\sigma.\ M^\tau$ has property
$\mathit{Match}^{\sigma \to \tau}$.

Since all $M^\sigma$ have property $\mathit{Match}^\sigma$, certainly the programs enjoy this property as well, which by
case (1) of the definition of $\mathit{Match}^\sigma$ establishes our proposition. □

Before  undertaking  a  comparison  of  our  circuit  model  with  various  evaluation  strategies,  we 
discuss the issues involved. Consider the following PCF program: $P = D_\iota\ tt\ 2\ M$, where $M$ is an
expression whose evaluation takes many steps before returning an integer. The circuit representation
of $P$ will include a piece corresponding to $M$. But in call-by-name PCF (or call-by-need,
a particular implementation of call-by-name), $M$ will not be evaluated, so the size of the circuit
representing $P$ will be a wild overestimate. Of course, we could evaluate the circuit bottom-up,
thus  avoiding  evaluation  of  M,  but  this  would  preclude  any  meaningful  discussion  of  circuit  size  or 
depth.  The  problem  is  that  circuits  are  most  closely  related  to  dataflow  networks,  which,  in  turn, 
are  most  naturally  implemented  in  a  data-driven  fashion,  an  embodiment  of  call-by-value.  Even 
though  call-by-name  (call-by-need)  can  also  be  implemented  in  parallel  using  graph  reduction,  that 
model  is  basically  the  equivalent  of  upside-down  demand-driven  evaluation  of  a  dataflow  network. 
Thus  it  seems  reasonable  to  confine  ourselves  to  comparisons  with  evaluation  strategies  which  are 
natural  for  dataflow  networks. 

We compare our circuit model to two different evaluation strategies: parallel call-by-value (c-b-v)
[49, 9] and call-by-speculation (c-b-s) [49, 75, 41]. Given an application $f\,x$, in parallel c-b-v the
function $f$ and the argument $x$ are evaluated in parallel, and only after both evaluations are completed
is the body of the function $f$ evaluated. Therefore, parallel c-b-v takes advantage of horizontal
parallelism (evaluating two tasks simultaneously), and also to a lesser extent of vertical parallelism
(pipelining).

In c-b-s, we also evaluate $f$ and $x$ in parallel, but we do not require that the evaluation of $x$
be complete before we evaluate the body of $f$. Thus c-b-s allows fully pipelined parallelism. If $x$
gives rise to a large computation that is not used by $f$ we will get a result much faster, but the
computation will still continue for some time. C-b-s thus introduces a distinction between minimum
and maximum time to evaluate an expression.

Figure 4.10 shows a profiling semantics for parallel c-b-v in the style of [9]. Judgments have the
form $\rho \vdash M \to_{cbv} v;\ d, w$, meaning that in environment $\rho$, $M$ evaluates to $v$ in $d$ steps (depth)
and with $w$ work. The possible results of an evaluation are values, ranged over by $v$, and they are
either constants or function closures:

$v ::= c \mid cl(\rho, x, M)$.

The  rule  for  addition  is  typical  of  the  treatment  of  most  constructs,  in  that  the  depth  of 
the  computation  is  the  maximum  of  the  depths  of  the  subcomputations,  with  the  addition  of  a 
constant for the evaluation of the construct itself. The work of the computation is the sum of the
work of the subcomputations plus a constant. In the case of conditionals, the condition is evaluated first, and
then  the  appropriate  branch  is  chosen.  Finally,  the  rule  for  application  shows  that  evaluation  of 
the  function  body  waits  for  evaluation  of  the  argument  to  complete. 

The  constants  chosen  for  the  depth  and  work  in  several  of  the  rules  are  different  from  [9],  as  we 
want  to  achieve  an  exact  match  with  our  circuit  semantics.  The  differences  are  not  significant. 
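
As an illustration of how these rules assign depth and work, here is a minimal Python sketch of an evaluator that follows the rules of Figure 4.10. The tuple encoding of terms and the name `eval_cbv` are assumptions made for this example, and only constants, addition, conditionals, variables, abstraction, and application are covered.

```python
# Hypothetical term encoding: ("const", c), ("add", a, b), ("if", c, t, e),
# ("var", x), ("lam", x, body), ("app", f, arg).

def eval_cbv(term, env):
    """Return (value, depth, work) for parallel call-by-value."""
    tag = term[0]
    if tag == "const":
        return term[1], 1, 1
    if tag == "add":
        v1, d1, w1 = eval_cbv(term[1], env)
        v2, d2, w2 = eval_cbv(term[2], env)
        return v1 + v2, 1 + max(d1, d2), 1 + w1 + w2
    if tag == "if":
        b, d1, w1 = eval_cbv(term[1], env)
        # Only the selected branch is evaluated, and only after the condition.
        v, d, w = eval_cbv(term[2] if b else term[3], env)
        return v, 1 + d1 + d, 1 + w1 + w
    if tag == "var":
        return env[term[1]], 0, 0
    if tag == "lam":
        return ("closure", env, term[1], term[2]), 0, 0
    if tag == "app":
        (_, cenv, x, body), d1, w1 = eval_cbv(term[1], env)
        v2, d2, w2 = eval_cbv(term[2], env)
        # The body starts only after both function and argument are done.
        v, d3, w3 = eval_cbv(body, {**cenv, x: v2})
        return v, max(d1, d2) + d3, w1 + w2 + w3
    raise ValueError(tag)

double = ("app", ("lam", "x", ("add", ("var", "x"), ("var", "x"))), ("const", 2))
unused = ("app", ("lam", "x", ("const", 3)), ("add", ("const", 1), ("const", 1)))
print(eval_cbv(double, {}))  # (4, 2, 2)
print(eval_cbv(unused, {}))  # (3, 3, 4)
```

In the second program, parallel c-b-v is charged for an argument that the body never uses.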

The  correspondence  between  our  circuit  semantics  and  parallel  c-b-v  is  fairly  simple:  as  long  as 
we  have  conditionals  in  a  program,  the  models  are  incomparable,  for  the  reasons  mentioned  earlier. 
However,  without  conditionals,  circuits  will  take  less  time  and  do  less  work  than  predicted  by  the 
operational  semantics,  in  the  following  sense: 

Proposition 4.5.3 Given a PCF program $M$ with $\mathcal{E}[\![M]\!]\bot = (v, d, s)$, and $\bot \vdash M \to_{cbv} v;\ t, w$,
the following hold:

1. If $M$ is conditional-free, then $s \le w$;

2. If $M$ is conditional-free, then $d \le t$.

$$\rho \vdash tt \to_{cbv} tt;\ 1, 1$$

$$\frac{\rho \vdash M_1 \to_{cbv} v_1;\ d_1, w_1 \qquad \rho \vdash M_2 \to_{cbv} v_2;\ d_2, w_2}{\rho \vdash M_1 + M_2 \to_{cbv} v_1 + v_2;\ 1 + \max(d_1, d_2),\ 1 + w_1 + w_2}$$

$$\frac{\rho \vdash M_1 \to_{cbv} tt;\ d_1, w_1 \qquad \rho \vdash M_2 \to_{cbv} v_2;\ d_2, w_2}{\rho \vdash D_\iota\, M_1 M_2 M_3 \to_{cbv} v_2;\ 1 + d_1 + d_2,\ 1 + w_1 + w_2}$$

$$\frac{\rho \vdash M_1 \to_{cbv} ff;\ d_1, w_1 \qquad \rho \vdash M_3 \to_{cbv} v_3;\ d_3, w_3}{\rho \vdash D_\iota\, M_1 M_2 M_3 \to_{cbv} v_3;\ 1 + d_1 + d_3,\ 1 + w_1 + w_3}$$

$$\frac{\rho(x) = v}{\rho \vdash x \to_{cbv} v;\ 0, 0}$$

$$\rho \vdash \lambda x.\ M \to_{cbv} cl(\rho, x, M);\ 0, 0$$

$$\frac{\rho \vdash M_1 \to_{cbv} cl(\rho', x, M_1');\ d_1, w_1 \qquad \rho \vdash M_2 \to_{cbv} v_2;\ d_2, w_2 \qquad \rho'[x/v_2] \vdash M_1' \to_{cbv} v;\ d_3, w_3}{\rho \vdash M_1\, M_2 \to_{cbv} v;\ \max(d_1, d_2) + d_3,\ w_1 + w_2 + w_3}$$

Figure 4.10: Profiling semantics for parallel call-by-value

Proof: By induction on $M$ as in the proof of Proposition 4.5.2. We define predicates $\mathit{Val}^\sigma$ by
induction on types:

1. If $M^\sigma$ is a program, then $M^\sigma$ has property $\mathit{Val}^\sigma$ if $\mathcal{E}[\![M^\sigma]\!]\bot = (v, d, s)$ if and only if
$\bot \vdash M \to_{cbv} v;\ t, w$, with $s \le w$, $d \le t$.

2. If $M^{\sigma \to \tau}$ is a closed term, then it has property $\mathit{Val}^{\sigma \to \tau}$ if whenever $N^\sigma$ is a closed term with
property $\mathit{Val}^\sigma$, $M^{\sigma \to \tau} N^\sigma$ has property $\mathit{Val}^\tau$.

3. If $M^\sigma$ is an open term with free variables $x_1^{\sigma_1}, \ldots, x_n^{\sigma_n}$, then it has property $\mathit{Val}^\sigma$ if the
instantiation $[N_1/x_1] \cdots [N_n/x_n]M^\sigma$ has property $\mathit{Val}^\sigma$ whenever $N_1, \ldots, N_n$ are closed terms
having properties $\mathit{Val}^{\sigma_1}, \ldots, \mathit{Val}^{\sigma_n}$, respectively.

The  result  is  easily  verifiable  for  constants  (except,  of  course,  for  conditionals),  so  we  only  consider 
the  induction  step: 

1. $M = x^\sigma$. Any closed instantiation of $x^\sigma$ by a term satisfying $\mathit{Val}^\sigma$ will have the same property.

2. $M = M_1^{\sigma \to \tau} M_2^\sigma$. By the induction hypothesis, $M_1^{\sigma \to \tau}$ has property $\mathit{Val}^{\sigma \to \tau}$ and $M_2^\sigma$ has
property $\mathit{Val}^\sigma$. If both $M_1, M_2$ are closed, then by case (2) of the induction hypothesis,
$M_1^{\sigma \to \tau} M_2^\sigma$ has property $\mathit{Val}^\tau$. If $M_1, M_2$ are open, then construct the set $S = FV(M_1) \cap
FV(M_2)$, restricted to variables of ground type. Any closed instantiation of $M_1, M_2$ with
closed terms satisfying $\mathit{Val}$ will result in the sharing of $x \in S$, in both circuits and parallel
c-b-v. Therefore $M_1^{\sigma \to \tau} M_2^\sigma$ will have property $\mathit{Val}^\tau$.

3. $M = \lambda x^\sigma.\ M^\tau$. We need to show that $\lambda x^\sigma.\ M^\tau$ has property $\mathit{Val}^{\sigma \to \tau}$. By the induction
hypothesis, $M^\tau$ has property $\mathit{Val}^\tau$. Suppose that $\lambda x^\sigma.\ M^\tau$ is closed. Let $N^\sigma$ be a closed term
satisfying $\mathit{Val}^\sigma$. If $M^\tau$ is closed (i.e., $x^\sigma$ is not used in $M^\tau$), then $\mathit{depth}((\lambda x^\sigma.\ M^\tau)N^\sigma) =
\mathit{depth}(M^\tau)$ and similarly for size. Since $M^\tau$ has property $\mathit{Val}^\tau$, so must $(\lambda x^\sigma.\ M^\tau)N^\sigma$ (parallel
c-b-v requires more time and work to evaluate $(\lambda x^\sigma.\ M^\tau)N^\sigma$ than $M^\tau$). This is the origin of
the inequalities in the proposition: parallel c-b-v will evaluate unused arguments.

If $M^\tau$ is open, then if $\sigma \in \{o, \iota\}$, $N^\sigma$ will be shared in both circuits and parallel c-b-v, so the
depth and work will be the same. For general $\sigma$, by the third part of the induction hypothesis,
$[N^\sigma/x^\sigma]M^\tau$ will have property $\mathit{Val}^\tau$.

Now suppose $\lambda x^\sigma.\ M^\tau$ is open. Any closed instantiation of $\lambda x^\sigma.\ M^\tau$ with closed terms
satisfying $\mathit{Val}$ will result in the same sharing of common variables of ground type in both
circuits and parallel c-b-v. Therefore, $\lambda x^\sigma.\ M^\tau$ has property $\mathit{Val}^{\sigma \to \tau}$.

Since all terms $M^\sigma$ have property $\mathit{Val}^\sigma$, so do the programs, which establishes our result. □

Figure 4.11 shows a profiling semantics for c-b-s, in the style of [41]. Because of the distinction
between minimum and maximum depth, the judgments are different from the parallel c-b-v ones.
They are of the form

$\rho, d \vdash M \to_{cbs} v;\ d', d'', w$,

meaning that in environment $\rho$, evaluating $M$ at depth $d$ leads to result $v$ which will be available
at depth $d'$, while the whole computation will be done at depth $d''$ and use $w$ work. In addition,
the environment $\rho$ has to keep track of the depth at which a value becomes available. This is equivalent
to the effect achieved in Roe's semantics [75] with time stamps.

The rules for constants, addition, and abstraction are not significantly different from the parallel
c-b-v case, because minimum and maximum depth are the same. The rule for variables takes
into account the depth at which a value becomes available. The rules for conditionals show that
minimum depth is the same as before, but maximum depth incorporates the evaluation of all
branches. Similarly, the maximum depth of an application includes the evaluation of the function,
the argument, and of the function body, while the minimum depth will only include the evaluation
of the argument if it is used.

$$\rho, d \vdash tt \to_{cbs} tt;\ d + 1,\ d + 1,\ 1$$

$$\frac{\rho, d \vdash M_1 \to_{cbs} v_1;\ d_1, d_1', w_1 \qquad \rho, d \vdash M_2 \to_{cbs} v_2;\ d_2, d_2', w_2}{\rho, d \vdash M_1 + M_2 \to_{cbs} v_1 + v_2;\ 1 + \max(d_1, d_2),\ 1 + \max(d_1', d_2'),\ 1 + w_1 + w_2}$$

$$\frac{\rho, d \vdash M_1 \to_{cbs} tt;\ d_1, d_1', w_1 \qquad \rho, d \vdash M_2 \to_{cbs} v_2;\ d_2, d_2', w_2 \qquad \rho, d \vdash M_3 \to_{cbs} v_3;\ d_3, d_3', w_3}{\rho, d \vdash D_\iota\, M_1 M_2 M_3 \to_{cbs} v_2;\ 1 + \max(d_1, d_2),\ 1 + \max(d_1', d_2', d_3'),\ 1 + w_1 + w_2 + w_3}$$

$$\frac{\rho, d \vdash M_1 \to_{cbs} ff;\ d_1, d_1', w_1 \qquad \rho, d \vdash M_2 \to_{cbs} v_2;\ d_2, d_2', w_2 \qquad \rho, d \vdash M_3 \to_{cbs} v_3;\ d_3, d_3', w_3}{\rho, d \vdash D_\iota\, M_1 M_2 M_3 \to_{cbs} v_3;\ 1 + \max(d_1, d_3),\ 1 + \max(d_1', d_2', d_3'),\ 1 + w_1 + w_2 + w_3}$$

$$\frac{\rho(x) = v;\ d'}{\rho, d \vdash x \to_{cbs} v;\ \max(d, d'),\ \max(d, d'),\ 0}$$

$$\rho, d \vdash \lambda x.\ M \to_{cbs} cl(\rho, x, M);\ d,\ d,\ 0$$

$$\frac{\rho, d \vdash M_1 \to_{cbs} cl(\rho', x, M_1');\ d_1, d_1', w_1 \qquad \rho, d \vdash M_2 \to_{cbs} v_2;\ d_2, d_2', w_2 \qquad \rho'[x/v_2; d_2], d_1 \vdash M_1' \to_{cbs} v;\ d_3, d_3', w_3}{\rho, d \vdash M_1\, M_2 \to_{cbs} v;\ d_3,\ \max(d_1', d_2', d_3'),\ w_1 + w_2 + w_3}$$

Figure 4.11: Profiling semantics for call-by-speculation
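
The distinction between minimum and maximum depth can be made concrete with a small evaluator. The Python sketch below is one reading of the c-b-s rules, under these assumptions: each judgment returns a minimum depth (result available) and a maximum depth (all work finished), the environment records the minimum depth at which an argument becomes available, and the function body starts at the depth at which the function itself is available. The term encoding and the name `eval_cbs` are hypothetical.

```python
# Hypothetical term encoding: ("const", c), ("add", a, b), ("if", c, t, e),
# ("var", x), ("lam", x, body), ("app", f, arg).

def eval_cbs(term, env, d):
    """Return (value, min_depth, max_depth, work); env maps x to (value, depth)."""
    tag = term[0]
    if tag == "const":
        return term[1], d + 1, d + 1, 1
    if tag == "add":
        v1, lo1, hi1, w1 = eval_cbs(term[1], env, d)
        v2, lo2, hi2, w2 = eval_cbs(term[2], env, d)
        return v1 + v2, 1 + max(lo1, lo2), 1 + max(hi1, hi2), 1 + w1 + w2
    if tag == "if":
        # Both branches are evaluated speculatively and both are paid for.
        b, lo1, hi1, w1 = eval_cbs(term[1], env, d)
        v2, lo2, hi2, w2 = eval_cbs(term[2], env, d)
        v3, lo3, hi3, w3 = eval_cbs(term[3], env, d)
        v, lo = (v2, lo2) if b else (v3, lo3)
        return v, 1 + max(lo1, lo), 1 + max(hi1, hi2, hi3), 1 + w1 + w2 + w3
    if tag == "var":
        v, avail = env[term[1]]
        return v, max(d, avail), max(d, avail), 0
    if tag == "lam":
        return ("closure", env, term[1], term[2]), d, d, 0
    if tag == "app":
        (_, cenv, x, body), lo1, hi1, w1 = eval_cbs(term[1], env, d)
        v2, lo2, hi2, w2 = eval_cbs(term[2], env, d)
        # The body does not wait for the argument; it only records when it arrives.
        v, lo3, hi3, w3 = eval_cbs(body, {**cenv, x: (v2, lo2)}, lo1)
        return v, lo3, max(hi1, hi2, hi3), w1 + w2 + w3
    raise ValueError(tag)

double = ("app", ("lam", "x", ("add", ("var", "x"), ("var", "x"))), ("const", 2))
unused = ("app", ("lam", "x", ("const", 3)), ("add", ("const", 1), ("const", 1)))
print(eval_cbs(double, {}, 0))  # (4, 2, 2, 2)
print(eval_cbs(unused, {}, 0))  # (3, 1, 2, 4)
```

For the second program the result is available at depth 1, while the unused argument keeps the computation running until depth 2 and still contributes to the work.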

It  should  be  fairly  clear  by  now  that  the  circuit  semantics  is  very  closely  related  to  c-b-s.  In  fact, 
it  would  be  a  slight  improvement  on  it  were  it  not  for  conditionals.  A  circuit  will  always  do  less 
work  than  c-b-s,  but  its  depth  might  be  larger  than  the  c-b-s  minimum  depth.  In  a  conditional-free 
program  P,  however,  the  depth  of  the  circuit  semantics  of  P  is  the  same  as  the  c-b-s  minimum 
depth. 

Proposition 4.5.4 If $\mathcal{E}[\![M]\!]\bot = (v, d, s)$, and $\bot, 0 \vdash M \to_{cbs} v';\ t, t', w$, then $v = v'$ and we have:
(1) $s \le w$; (2) $d \le t$; (3) if $M$ is conditional-free, then $d = t$.

Proof:  Similar  to  the  proof  of  Proposition  4.5.3.  □ 

There  is  another  order  of  evaluation  to  which  we  could  compare  our  circuit  model.  Hudak  and 
Anderson  [49]  consider  parallel  eager  evaluation,  a  version  of  parallel  c-b-v.  Instead  of  requiring 
the  argument  to  complete  evaluating  before  the  body  of  the  function  is  evaluated,  we  require  that  it 
complete  before  the  function  call  returns.  This  gives  extra  parallelism  over  parallel  c-b-v,  but  might 
still  evaluate  some  arguments  unnecessarily.  The  correspondence  between  parallel  eager  evaluation 
and  circuits  would  be  the  same  as  that  between  parallel  c-b-v  and  circuits. 

4.5.2  Circuits  for  query 

The  circuit  semantics  of  query  is  a  straightforward  implementation  of  the  semantics  presented 
earlier. Accordingly, we need to expand our set of basic gates with those of Figure 4.7(f). An
“if-then” gate is just shorthand for an if-then-else statement, with a $\bot$ “else” part. And is
parallel-and. For space reasons amb is depicted at times with more than two inputs; the understanding is
that a $k$-ary amb gate stands for a balanced tree of binary amb gates.

Figure 4.12(a) shows the circuit semantics of parallel-or. The not gate is shorthand for $D_o\ x\ ff\ tt$.
Figure 4.12(b) shows the circuit semantics of $\mathit{not}_\bot$ from Section 5.2, which essentially performs
negation with respect to $\bot$ (note how “_” is used to represent $\bot$).


4.6  On  the  method  and  the  metric 

This  section  describes  our  methodology  for  comparing  DPCF  and  NPCF.  First,  we  refine  the  notion 
of  intensional  expressiveness  to  take  into  account  both  measures  of  program  complexity  induced 
by the circuit semantics: work and time. Then we compare our circuit model with the traditional
parallel computation model, the PRAM, establishing a simple connection via an analogue of
Brent’s  theorem.  Finally,  we  detail  the  hardware  assumption  we  make  that  allows  us  to  compare 
nondeterministic  and  deterministic  programs. 

4.6.1  Intensional  expressiveness  and  parallel  complexity 

Given  a  program  P,  the  circuit  semantics  of  P  provides  a  definition  of  its  parallel  complexity:  the 
parallel  time  and  the  parallel  work  required  to  execute  it.  Accordingly,  we  can  define  two  notions 
of  intensional  expressiveness. 

Figure 4.12: Circuit semantics for: (a) por, (b) $\mathit{not}_\bot$

We say that language $L_1$ is intensionally more work-expressive than $L_2$ ($L_1 \succeq_w L_2$), if any
function computable by $L_2$ with a program whose size under the circuit semantics is $s_2$ can be
computed by an $L_1$ program with circuit size $s_1 \le s_2$. Similarly, $L_1$ is intensionally more
time-expressive than $L_2$ ($L_1 \succeq_t L_2$), if any function computable by $L_2$ with a program whose depth
under the circuit semantics is $d_2$ can be computed by an $L_1$ program with circuit depth $d_1 \le d_2$.

As mentioned in the introduction, we are interested in asymptotic complexity: for a function $f$
of $n$ arguments, we want to compare the size and depth of circuits computing $f$ as functions of $n$.
In particular, we would like to establish separation results of the type $L_1 \succ_w L_2$ and $L_1 \succ_t L_2$.

4.6.2  Comparison  with  the  PRAM 

We  compare  our  parallel  complexity  model,  circuit  semantics,  with  the  standard  model  from  the 
theory  of  parallel  algorithms,  the  PRAM  [23].  This  comparison  will  be  needed  later  when  we 
establish  a  connection  between  PCF  programs  and  boolean  circuits. 

Suppose  we  have  a  program  M  whose  parallel  complexity  according  to  the  circuit  semantics 
is  size  s  and  depth  d.  The  question  we  have  to  address  is  how  long  it  takes  to  execute  M  on  a 
p-processor  PRAM. 

We  can  imagine  the  process  of  executing  M  in  two  parts:  construction  and  execution.  First 
we  construct  the  circuit  semantics  of  M,  then  we  execute  it.  We  consider  the  execution  part  first, 
since  it  is  very  simple. 

Proposition 4.6.1 [Execution] Given the circuit semantics of program $M$, of size $s$ and depth $d$,
it can be simulated on a $p$-processor CREW PRAM in $O(s/p + d)$ time.

Proof: This follows from Brent's theorem [23], which establishes bounds for simulating boolean
circuits on the PRAM. Our circuits satisfy the necessary conditions for Brent's theorem: we have
bounded fan-in, and we assume that each gate can be simulated in $O(1)$ time. □
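
The bound can be checked on small examples with a level-by-level schedule. The Python sketch below is an illustration under simplifying assumptions, not a PRAM simulator: a circuit is a dictionary from gate names to input names, each gate costs one step, and a level with $k$ gates is processed in $\lceil k/p \rceil$ rounds. The names `levels` and `simulate_steps` are hypothetical.

```python
from math import ceil

def levels(circuit):
    """Map each gate to its level: 1 + the maximum level of its inputs (inputs are level 0)."""
    memo = {}
    def level(node):
        if node not in circuit:           # a circuit input
            return 0
        if node not in memo:
            memo[node] = 1 + max(level(i) for i in circuit[node])
        return memo[node]
    for gate in circuit:
        level(gate)
    return memo

def simulate_steps(circuit, p):
    """Steps needed when each level is processed in rounds of at most p gates."""
    per_level = {}
    for lvl in levels(circuit).values():
        per_level[lvl] = per_level.get(lvl, 0) + 1
    return sum(ceil(count / p) for count in per_level.values())

# A balanced binary tree over 8 inputs: 7 gates, depth 3.
tree = {
    "g1": ["x1", "x2"], "g2": ["x3", "x4"], "g3": ["x5", "x6"], "g4": ["x7", "x8"],
    "g5": ["g1", "g2"], "g6": ["g3", "g4"],
    "g7": ["g5", "g6"],
}
s, d = len(tree), max(levels(tree).values())
for p in (1, 2, 4):
    # Each level contributes at most (gates on level)/p + 1 steps.
    assert simulate_steps(tree, p) <= s / p + d
```

Summing $\lceil k_i/p \rceil \le k_i/p + 1$ over the $d$ levels gives at most $s/p + d$ steps, which is the shape of the bound in the proposition.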

For our purposes, we do not need a Construction proposition. Rather, we want to know if
there is any difference in construction time between DPCF and NPCF programs. Since the only
difference is in the treatment of queries, we only consider the time needed to generate circuits for
each of the two kinds of queries.

Proposition 4.6.2 [Construction equivalence] It takes asymptotically the same time on a $p$-processor
PRAM to construct circuits for deterministic and nondeterministic interpretations of a query.

Proof: To implement deterministic query we need to look at every element of each pattern at least
once. Then the time required to process a query on $n$ variables and containing $k$ patterns must
be at least $O(nk/p)$.

For the nondeterministic interpretation, we have to check if any “_” is supposed to be $\bot$. This
can be achieved as follows: First check if any column in the pattern exhaustively checks for all
possibilities, and identify all “_” that occur in those columns. Also, keep track of whether the column
consists entirely of variables and “_”. This can be done in $O(nk/p)$.

If all columns contain only variables and “_”, then all “_” from exhaustively checked columns
are $\bot$, and we are done. Otherwise, pick one column, call it $i$, that is exhaustively checked with
more than variables and “_”, and verify the rest of the pattern for consistency. This also takes
$O(nk/p)$.

If consistent, then all “_” from column $i$ are $\bot$, and we are done, since including column $i$ in
any future consistency check will fail. If not consistent, then again we are done for the same reason.

Since we only take $O(nk/p)$ to find out which “_” mean $\bot$, it takes the same time to generate
circuits for both deterministic and nondeterministic query. □

4.6.3  How  to  compare  determinism  and  nondeterminism 

Our notion of intensional expressiveness involves comparing the complexity of computing functions,
so we have to make sure our nondeterministic programs return deterministic values. Since
a purely nondeterministic interpretation of query fails to be intensionally more expressive
(Proposition 4.4.1), we chose an interpretation that allows us to get deterministic answers, assuming we
can detect undefined inputs. This section discusses our assumption in more detail.

There  has  been  a  considerable  amount  of  research  in  the  area  of  algorithms  for  unreliable 
distributed  systems.  For  instance,  a  whole  field  is  devoted  to  consensus  problems  in  such  systems 
[34]. In such algorithms, a key assumption is that one can detect the failure of a process to send
a message. Without this assumption, i.e., in a fully asynchronous model, the consensus problem
is  not  solvable.  The  physical  realization  of  this  assumption  is  quite  reasonable,  and  does  not  even 
require  fully  synchronous  hardware.  Lamport  [58]  has  shown  how  to  detect  failure  to  send  messages 
through  the  use  of  timeouts.  This  requires  a  model  with  accurate  clocks  and  bounds  on  message 
transit  times. 

We  shall  assume  that  we  are  able  to  detect  undefined  inputs  through  the  use  of  such  hardware. 
Since  DPCF  cannot  take  advantage  of  this  (by  the  definition  of  deterministic  query),  in  a  sense, 
our question has become: Does the ability to detect undefined inputs in NPCF imply that NPCF
$\succ_w$ DPCF and/or NPCF $\succ_t$ DPCF?

The  difficulty  in  answering  this  question  stems  from  the  fact  that  we  have  to  exhibit  a  function 
computable  by  both  NPCF  and  DPCF,  but  which  can  be  computed  faster  or  with  less  work  by 
NPCF.  It  is  not  enough  to  say  that  NPCF  can  compute  generalized  functions  because  they  have  no 
deterministic  counterpart.  On  the  other  hand,  DPCF  is  quite  powerful.  It  can  express  parallel-or 
and  parallel-and,  so  it  can  implement  comparators  for  boolean  values  which  work  with  undefined 
inputs.  Using  comparators  we  can  implement  asymptotically  very  efficient  sorting  networks,  such 
as the AKS network [3], which can sort $n$ inputs in depth $O(\lg n)$ and size $O(n \lg n)$. Since a DPCF
program must output a single value we cannot just implement the sorting network, but we can use
it internally, for instance, to compute the majority function, by first sorting the input and then
picking the middle value [87].
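
The sort-then-pick-the-middle idea can be sketched in a few lines of Python. This is an illustration under loud assumptions: it uses an odd-even transposition network (depth $n$) purely for brevity, not the AKS network, and it works on ordinary Python booleans, so undefined inputs are not modeled. The function names are hypothetical.

```python
from itertools import product

def comparator(x, y):
    """Boolean comparator: (min, max) computed with AND and OR only."""
    return x and y, x or y

def transposition_sort(bits):
    """Odd-even transposition network: n rounds of disjoint comparators."""
    bits = list(bits)
    n = len(bits)
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            bits[i], bits[i + 1] = comparator(bits[i], bits[i + 1])
    return bits

def majority(bits):
    """Middle element of the sorted sequence (for an odd number of inputs)."""
    return transposition_sort(bits)[len(bits) // 2]

# Exhaustive check on 5 inputs: the middle value is true iff at least 3 inputs are.
for bits in product([False, True], repeat=5):
    assert majority(bits) == (sum(bits) >= 3)
```

Because the comparator uses only AND and OR, the whole computation stays within monotone operations.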

Such  considerations  led  us  to  examine  more  closely  the  similarity  between  DPCF  and  NPCF 
programs  and  boolean  circuits.  It  turns  out  that  there  is  a  very  close  correspondence. 


4.7  A  connection  with  boolean  circuits 

This  section  contains  a  formal  connection  between  our  DPCF  and  NPCF  languages  and  boolean 
circuits.  Since  there  are  different  kinds  of  boolean  circuits  in  the  literature,  depending  on  the  basis, 
we  first  give  a  very  brief  overview  of  the  circuits  we  are  interested  in. 

4.7.1  Boolean  circuits 

A boolean circuit computes a boolean function $f : \{0,1\}^n \to \{0,1\}$. For our purposes, a boolean
circuit  is  a  directed  acyclic  graph  whose  inputs  are  sources  of  the  graph,  and  whose  nodes  (gates) 
are  selected  from  a  set  called  the  basis.  The  fan-out  of  the  inputs  and  gates  in  the  circuit  is 
unbounded.  The  fan-in  is  2.  Normally,  a  circuit  can  have  more  than  one  output,  but  we  are  only 
interested  in  single-output  circuits,  since  our  DPCF  and  NPCF  programs  only  return  one  output. 

Gates  in  the  basis  are  single  output.  Among  the  many  bases  studied,  two  are  of  interest  to 
us: the monotone basis $\{\wedge, \vee\}$, and the De Morgan basis $\{\wedge, \vee, \neg\}$. The monotone basis is not
complete,  i.e.  not  every  boolean  function  is  computable  by  monotone  circuits  (only  the  monotone 
functions  are),  but  it  is  of  particular  interest  to  researchers  in  complexity  theory,  since  strong  lower 
bounds  have  been  obtained  for  monotone  circuits  [87].  The  De  Morgan  basis  is  complete. 

There  are  two  measures  for  the  efficiency  of  a  circuit.  The  size  or  complexity  is  the  number  of 
gates  in  the  circuit;  this  intuitively  measures  the  work  needed  to  compute  the  output.  The  depth  is 
the  number  of  gates  in  the  circuit  on  the  longest  path  from  the  input  to  any  output;  this  measures 
the  time  required  to  compute  the  output. 
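
To make the two measures concrete, the following Python sketch evaluates a small monotone circuit and computes its size and depth. The dictionary encoding and the example circuit (majority of three inputs) are assumptions chosen for illustration.

```python
from itertools import product

# Gate name -> (operation, left input, right input); other names are circuit inputs.
MAJ3 = {
    "ab": ("and", "a", "b"),
    "ac": ("and", "a", "c"),
    "bc": ("and", "b", "c"),
    "o1": ("or", "ab", "ac"),
    "out": ("or", "o1", "bc"),
}

def evaluate(circuit, node, inputs):
    """Value of a node for the given assignment of the circuit inputs."""
    if node not in circuit:
        return inputs[node]
    op, left, right = circuit[node]
    l, r = evaluate(circuit, left, inputs), evaluate(circuit, right, inputs)
    return (l and r) if op == "and" else (l or r)

def size(circuit):
    """Number of gates."""
    return len(circuit)

def depth(circuit, node):
    """Number of gates on the longest path from an input to the node."""
    if node not in circuit:
        return 0
    _, left, right = circuit[node]
    return 1 + max(depth(circuit, left), depth(circuit, right))

print(size(MAJ3), depth(MAJ3, "out"))  # 5 3
for a, b, c in product([False, True], repeat=3):
    assert evaluate(MAJ3, "out", {"a": a, "b": b, "c": c}) == (a + b + c >= 2)
```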

There  are  several  complexity  classes  defined  in  terms  of  circuits.  We  define  those  which  will  be 
used in what follows (cf. [10]). A family of circuits is a sequence $(C_1, C_2, \ldots)$, where $C_n$ takes $n$
input variables. A uniform family is one where the description of $C_n$ can easily be computed from
$n$. The classes $NC^k$ ($AC^k$), for $k \ge 0$, consist of all functions computable by a uniform family of
polynomial size, $O(\log^k n)$ depth circuits with constant (unbounded) fan-in. $NC = \bigcup_{k \ge 0} NC^k$, and
similarly $AC = \bigcup_{k \ge 0} AC^k$. It is easy to show that for all $k \ge 0$, $AC^k \subseteq NC^{k+1} \subseteq AC^{k+1}$, therefore
$NC = AC$.

4.7.2  The  connection 

The  main  idea  of  the  connection  is  very  simple  and  can  best  be  described  by  a  picture.  Figure  4.13 
shows  the  domain  of  booleans  for  circuits,  and  the  domains  of  booleans  and  natural  numbers 
for  DPCF  and  NPCF.  Booleans  are  ordered  differently  in  the  circuit  world  than  in  programming 
language  semantics.  Monotone  means  the  same  thing  in  both  places,  but  with  respect  to  a  different 
ordering.  That  is  why  negation  is  not  monotone  in  the  case  of  circuits,  but  it  certainly  is  expressible 
in  PCF,  which  only  computes  continuous  (hence  monotone)  functions.  Therefore,  0  has  the  same 
role  in  the  circuit  world  that  _L  has  for  PCF  programs. 

Figure 4.13: Domains for (a) boolean circuits, (b) DPCF and NPCF programs

When a monotone circuit is presented with an input, there is no way for the circuit to determine
if that input is 0, since all it can do is AND and OR the inputs. This is really the same situation
as when a DPCF program can have undefined inputs. The DPCF constructs cannot help identify
undefined inputs.

The  effect  of  the  hardware  assumption  we  made  is  now  clear.  NPCF  can  detect  undefined 
inputs,  so  it  essentially  has  negation  with  respect  to  bottom  (cf.  Figure  4.12(b)).  Therefore,  NPCF 
resembles  De  Morgan  circuits. 

Since DPCF and NPCF operate on a different domain than boolean circuits, we need a way to
take a function computed by a boolean circuit, say $f : \{0,1\}^n \to \{0,1\}$, and view it as a function
on PCF domain elements, say $\bot$, tt (the choice of tt rather than ff is arbitrary). Let us call the
equivalent of $f$ in the PCF world $f_{pcf}$. Conversely, given $g : \{\bot, tt\}^n \to \{\bot, tt\}$ in PCF, we define
$g_{bool}$ to be its counterpart in the circuit world.

DPCF  and  NPCF  are  at  least  as  powerful  as  boolean  circuits.  Given  a  circuit,  we  can  divide  it 
into  “blocks,”  each  implementable  by  a  DPCF  (NPCF)  function,  and  select  a  suitable  application 
order  that  does  not  duplicate  work,  thus  constructing  an  equivalent  program.  For  the  reasons  listed 
above,  monotone  circuits  are  as  powerful  as  DPCF  when  presented  with  undefined  inputs. 

Proposition 4.7.1 Given a monotone (De Morgan) circuit computing $f$, we can construct from it
a DPCF (NPCF) program computing $f_{pcf}$, with the same dimensions in the circuit semantics.

Proof:  We  only  discuss  the  monotone  case  here  as  the  other  case  is  similar. 

It  is  clear  that  the  only  problems  we  might  encounter  occur  when  the  fan-out  is  larger  than  1 
for  some  nodes.  If  the  fan-out  is  always  1,  the  circuit  is  a  tree,  and  can  be  represented  by  a  formula. 
If  the  fan-out  is  larger  than  1  for  some  node,  then  we  have  to  be  careful  we  do  not  duplicate  work 
in  the  DPCF  program.  Since  a  link  in  the  circuit  is  equivalent  to  an  application  in  DPCF,  we 
want  to  figure  out  an  application  order  such  that  in  the  circuit  semantics  we  end  up  with  the  same 
structure  as  the  original  circuit. 

We  give  an  algorithm  to  construct  the  DPCF  program.  First,  we  identify  all  nodes  with  fan-out 
>  1  in  the  circuit.  Then,  beginning  at  the  bottom,  we  identify  all  the  parts  of  the  circuit  which  are 
trees.  That  is,  we  go  as  far  as  possible  up  the  circuit  in  all  directions,  until  we  get  to  the  recipient 
of  a  multiple  output.  This  determines  one  DPCF  function.  We  continue  in  similar  fashion  from 
the  origins  of  the  multiple  outputs.  We  have  now  divided  the  circuit  into  blocks,  and  all  we  need 
to  do  is  figure  out  an  application  order. 

We start with the last block in the circuit (the one containing the output node); call it $F_0$. Let
$N$ be the set of all of its immediate neighbors. $F_0$ will be a function of $|N|$ arguments. We apply
$F_0$ to all elements of $N$ from which there is no path going into another element of $N$. This works
because there are no cycles in the circuit. We continue like this with all elements from $N$ which
were used, and which receive inputs. A minor issue is the ordering of the inputs when doing the
applications. We need to order them so that the element of $N$ for which there exists the longest
path into $F_0$ is the last input. In case of ties the order is not important.

Figure 4.14: Two circuits

We  also  have  to  make  sure  the  variable  names  are  the  same  in  different  blocks  when  they  have 
to be bound to the same input. For instance, consider the following case: we have three nodes,
$F_0$, $F_1$, and $F_2$, with $F_1$ having a link into $F_0$, and $F_2$ having links into both $F_0$ and $F_1$. Then
the nodes will have the form $F_0 = \lambda xy.\, f_0$, $F_1 = \lambda z.\, f_1$, and the application order will be $F_0 F_1 F_2$.
When we “hook up” $F_0$ and $F_1$ we have to change all the $z$’s in $F_1$ to $y$’s. Equivalently, we could
have $F_1 = \lambda y.\, f_1$ and use a $\beta$ substitution rule that allows variable capture.

This  procedure  yields  a  DPCF  program  whose  circuit  semantics  quite  obviously  looks  the  same 
as  the  original  circuit,  and  therefore  has  the  same  size  and  depth.  □ 
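The first step of the construction, cutting the circuit into tree-shaped blocks at the nodes whose output is shared, can be sketched as follows. This is one possible reading of the procedure, not the thesis's own code: the circuit is assumed to be given as a dictionary mapping every node (gates and input leaves alike) to the list of nodes feeding it, and `split_into_blocks` is a hypothetical name. The sketch does not compute the application order.

```python
def split_into_blocks(inputs_of, output):
    """Partition a circuit DAG into tree-shaped blocks.

    inputs_of maps each node to the list of nodes feeding it (leaves map to []).
    A block is rooted at the output node or at any node with fan-out > 1.
    """
    fan_out = {n: 0 for n in inputs_of}
    for preds in inputs_of.values():
        for p in preds:
            fan_out[p] += 1

    roots = [n for n in inputs_of if n == output or fan_out[n] > 1]
    blocks = {}
    for root in roots:
        members, stack = [], [root]
        while stack:
            n = stack.pop()
            members.append(n)
            for p in inputs_of[n]:
                if fan_out[p] <= 1:   # p is used only here, so it stays in this tree
                    stack.append(p)
        blocks[root] = sorted(members)
    return blocks

# Example: gate s is shared by g1 and g2, so it starts a block of its own.
circuit = {
    "out": ["g1", "g2"],
    "g1": ["s", "a"],
    "g2": ["s", "b"],
    "s": ["c", "d"],
    "a": [], "b": [], "c": [], "d": [],
}
blocks = split_into_blocks(circuit, "out")
```

Here the shared gate `s` and its private inputs form one block, and everything else hangs off the output block. Each block would then become one function, with shared blocks applied once rather than duplicated.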

Example 4.7.2 We give an example to illustrate the algorithm described above. Figure 4.14 shows
the block structure of two circuits. Case (a) would give rise to the following program $P_a$ (for
simplicity, we assume $\beta$ substitution with variable capture, as discussed above):

$F_0 = \lambda xy.\, f_0$,  $F_1 = \lambda zx.\, f_1$,  $F_2 = \lambda x.\, f_2$
$P_a = F_0 (F_1 F_2) F_3$.

Case (b) would result in the construction of the following $P_b$:

$F_0 = \lambda xy.\, f_0$,  $F_1 = \lambda yz.\, f_1$,  $F_2 = \lambda y.\, f_2$
$P_b = F_0 F_1 F_2 F_3$.


Now  we  show  that  monotone  circuits  are  as  powerful  as  DPCF  for  a  certain  class  of  functions. 
For  our  applications,  we  do  not  need  the  equivalent  statement  for  De  Morgan  circuits  and  NPCF 
programs. 

Proposition 4.7.3 If DPCF can compute $f : \{\bot, tt\}^n \to tt$, then monotone circuits can compute
$f_{bool}$ with a circuit whose dimensions are the same as the circuit semantics of the DPCF program.

Proof: Let us assume the function $f$ is not constant, otherwise the proof is trivial. We can divide
the constructs of DPCF into sequential and parallel. There is only one parallel construct, query.
The sequential constructs, if presented with an undefined input, will all return $\bot$. Therefore, in
order to obtain any result when computing $f$ we must use query.

We can translate the tt in the input to anything else we would like: ff, any integer. We cannot
translate $\bot$. The functions computable in DPCF on $\bot$ and any other non-bottom input are the
same: AND, OR, constant functions. Since we must use query to make any progress, and since
queries on $\bot$ and any other input are all the same, it doesn’t matter what the other input is. So
translating the tt from the input makes no difference. We would simply have to translate back at
the output.

We can then eliminate the unnecessary translations and leave only the queries. The circuit
semantics of the resulting program, with $\bot$, tt inputs changed to 0, 1 inputs respectively, is a valid
monotone circuit computing $f_{bool}$. □

Note that we can consider the DPCF program for $f$ to be conditional-free; therefore, according
to Propositions 4.5.3 and 4.5.4, the result implies that a boolean circuit can compute $f_{bool}$ with at least
the same efficiency that DPCF can compute $f$ under either parallel c-b-v or c-b-s.


4.8  Applications 

Given  the  correspondence  we  have  established  between  DPCF,  NPCF,  and  boolean  circuits,  we  can 
use  strong  results  from  complexity  theory  to  prove  equivalent  statements  about  our  programming 
languages. These results are applicable for either parallel c-b-v or c-b-s (cf. the discussion at the end
of the previous section), and also contain implications about evaluation on the PRAM.

Improving  on  an  earlier  superpolynomial  bound  by  Razborov,  Tardos  [84]  proved  a  very  strong 
separation  result  between  monotone  and  De  Morgan  circuit  complexity: 

Theorem 4.8.1 [Tardos] There exists a polynomial time computable monotone function whose
monotone complexity is exponential.

The monotone function discussed is the perfect matching function [23], which takes an adjacency
matrix as input and returns 1 if the graph has a perfect matching. DPCF and NPCF programs
would compute the same function on $\{\bot, tt\}^n$ inputs.

Since  perfect  matching  is  in  P,  there  exists  a  De  Morgan  circuit  of  polynomial  size  for  it  [10]. 
By  Proposition  4.7.1  there  is  an  NPCF  program  for  it  which  does  a  polynomial  amount  of  work. 
Since  the  monotone  complexity  of  perfect  matching  is  exponential,  by  Proposition  4.7.3  any  DPCF 
program  for  it  will  do  an  exponential  amount  of  work.  Therefore,  we  have: 

Proposition 4.8.2 There exists a function computable by an NPCF program with polynomial work,
but for which the best DPCF program does exponential work.

By Propositions 4.6.1 and 4.6.2, the result can be stated in terms of execution time on the PRAM:
there exists a function computable by both DPCF and NPCF but which requires exponentially
more time to execute on a PRAM for DPCF than NPCF.

A  strong  separation  result  between  monotone  and  De  Morgan  circuits  is  also  known  for  circuit 
depth  [72]. 

Theorem 4.8.3 [Raz & Wigderson] There is a monotone function in NC$^1$ that has no monotone
NC circuits.
The function referred to in Theorem 4.8.3 is a variant of the matching function (matching of
size $n/3$, where $n$ is the number of vertices in the graph). By an argument similar to the one above we can
apply this result to NPCF and DPCF programs.

Proposition 4.8.4 There exists a function computable by an NPCF program in logarithmic time,
and for which the best DPCF program takes more than polylogarithmic time.

4.9  Discussion 

We have defined a new, intensional denotational semantics for functional languages, circuit
semantics. Circuit semantics associates a gate with each basic construct of the language, and takes
the meaning of a program to be a circuit. The dimensions of the circuit enable reasoning about
running time and work required for execution. We have established circuit semantics as a parallel
complexity model, by comparing it to the time and work required to execute programs under
several parallel evaluation strategies, and in the PRAM model. We have also used circuit semantics
to obtain relative intensional expressiveness results for parallel extensions of PCF.

We have shown that deterministic query is intensionally more expressive than $pif_\iota$, which, in
turn, is intensionally more expressive than por and $pif_o$. Thus, we have the beginnings of a hierarchy
of intensional expressiveness for deterministic parallelism. In the process, we have exhibited
languages which are extensionally but not intensionally equivalent. The constructs por, $pif_o$, and $pif_\iota$
are interdefinable in the continuous function model of PCF. However, PCF + $pif_\iota$ is intensionally
more expressive than PCF + por (or $pif_o$). A natural question raised by this is whether there exists
a language that is extensionally more expressive but intensionally less expressive (on the common
subset of computable functions) than another language. The case of the Girard-Reynolds system
F versus Gödel’s system T might be an example of this, but the matter is not settled yet (cf. [20]).

In order to compare deterministic and nondeterministic query, we were forced to make an
assumption about having the ability to detect undefined inputs. Though somewhat unaesthetic, this
assumption allowed us to view the question as a similar one from complexity theory, that of
comparing monotone and De Morgan boolean circuits. After establishing a connection between the
dimensions of a program under the circuit semantics and the complexity and depth of a
corresponding boolean function, we were able to show that nondeterministic query is intensionally more
expressive than its deterministic counterpart: it can lead to exponentially faster programs, and also
programs that do exponentially less work.

Although  we  have  used  results  from  circuit  complexity,  we  have  not  “given  anything  back.”  It 
would  be  interesting  to  find  out  if  the  connection  between  DPCF  programs  and  monotone  circuits 
has some wider applicability in the area of circuit complexity. It seems unrealistic to hope that
it  would  be  easier  to  prove,  say,  strong  lower  bounds  for  the  complexity  of  slice  functions  [87] 
by  examining  DPCF  programs  with  undefined  inputs.  But  perhaps  the  more  general  connection 
between  functional  parallel  programs  with  undefined  inputs  and  boolean  circuits  can  be  fruitfully 
exploited. 


Chapter  5 

Type  Inference 


This  chapter  marks  the  beginning  of  the  second  part  of  the  thesis.  After  our  explorations  of  circuit 
semantics  in  the  context  of  parallel  extensions  of  PCF,  we  return  to  CDSO  and  design  a  type 
inference  system  for  it.  Our  ultimate  intent  is  to  use  this  type  inference  system  to  analyze  PCF-like 
languages,  taking  advantage  of  the  intensional  information  provided  by  sequential  algorithms.  We 
achieve  this  goal  in  the  next  chapter,  where  we  introduce  a  high-level  lazy,  functional  language, 
show  how  to  translate  it  to  CDSO,  and  build  a  refinement  type  inference  system  on  top  of  our  type 
inference  system. 

We  first  present  some  general  considerations  in  designing  a  type  system  for  CDSO  in  Section  5.1. 
Section 5.2 discusses CDSO type definitions in detail. We define the meaning of subtyping and
intersection types for ground dcds in Section 5.3, and do the same for sequential algorithms in
Section 5.4. A decision procedure for subtyping in the monomorphic case is shown in Section 5.5.
We present the monomorphic type inference rules and prove soundness of monomorphic type
inference in Section 5.6. Next, we show how polymorphism and overloading can arise in CDSO in
Section  5.7,  and  describe  how  to  decide  subtyping  in  the  presence  of  type  variables  in  Section  5.8. 
Finally,  Section  5.9  gives  the  rules  for  type  inference  incorporating  polymorphism  and  overloading 
and  shows  soundness  of  the  extended  system. 


5.1  Issues  in  designing  a  type  system  for  CDSO 

The  types  in  CDSO  are  the  dcds’s.  The  original  intent  [5,  7]  was  that  CDSO  would  be  typechecked. 
The meaning of typing judgments was taken to be: $x : \tau$ if $x \in D(\tau)$. So the user would type in an
expression  and  a  type,  and  the  system  would  check  if  the  expression  belongs  to  the  set  of  states  of 
that  type.  Unfortunately,  this  was  never  implemented. 

Our  intent  is  to  devise  a  type  inference  system.  The  motivation  for  this  is  twofold:  First,  CDSO 
is,  in  a  sense,  a  low-level  programming  language.  Writing  programs  in  CDSO  is  sometimes  difficult, 
and  a  type  inference  system  would  greatly  ease  the  task.  Second,  and  more  important,  sequential 
algorithms  form  an  intensional  semantics  for  sequential  programming  languages,  and  we  would  like 
a  way  of  extracting  the  intensional  information  present  in  an  algorithm.  We  can  imagine  sequential 
algorithms  as  an  intermediate  language  in  the  compilation  of  a  functional  language,  an  intermediate 
language  which  makes  many  kinds  of  analyses  of  program  properties  easier.  Our  type  inference 
system  will  be  the  foundation  on  which  we  build  our  program  analysis. 

There  are  several  implications  of  our  desire  to  build  a  type  inference  system  for  CDSO,  among 
them  the  need  for  a  definition  of  subtyping  for  dcds’s,  and  the  need  to  introduce  intersection  types. 
We discuss each in turn, also pointing out peculiarities of CDSO which must be reflected in the
type  system. 

If we take the original meaning of typing judgments and add subtyping, we are forced to say
that $\sigma \le \tau$ if and only if $D(\sigma) \subseteq D(\tau)$, i.e., we get a system in which subtyping is equivalent
to inclusion of sets of states. So a supertype will be “bigger” than a subtype: it will have more
cells, more values per cell, and a “weaker” (more permissive) enabling relation. It turns out that
this notion of subtyping does not accord with the usual notion of subtyping from object-oriented
languages [43]. Consider the following definitions of dcds’s for points and colored points:

let  point  =  dcds 
cell  X  values  [..] 
cell  Y  values  [..] 
end; 

let  cPoint  =  dcds 
cell  X  values  [..] 
cell  Y  values  [..] 
cell  C  values  red,  green,  blue 
end; 

With the view of subtyping as inclusion of sets of states it would be the case that point $\le$ cPoint.
This is the exact opposite of what would happen in an object-oriented language. For example,
the same types defined as record types look as follows (the language is a generic record language,
similar to one from [42]):

type point = { X : Int, Y : Int };

type color = red | green | blue;

type cPoint = { X : Int, Y : Int, C : color };

For record types, a subtype is “more specific” than a supertype, that is, when viewed as a property,
it is applicable to fewer records, and so cPoint $\le$ point.

Another  problem  with  the  notion  of  subtyping  as  inclusion  of  sets  of  states  is  caused  by  the 
computation  model  of  CDSO.  The  model  is  one  of  incrementally  growing  a  state,  by  filling  an 
accessible  cell  with  a  value.  One  would  expect  that,  as  we  add  information,  the  type  would  decrease, 
because we are becoming more specific. However, the exact opposite would happen: given the state
{X = 3, Y = 4} : point, if we add the event C = red, we get a cPoint, which is higher in the type
hierarchy.
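The inclusion-of-states reading can be checked on a small finite stand-in for these two dcds’s. The sketch below makes several simplifying assumptions that are not in the text: the open value range `[..]` is replaced by the integers 0 to 4, there are no access conditions, and a state is modeled as a set of (cell, value) events. The function name `states` is illustrative only.

```python
from itertools import product

def states(dcds):
    """All states of a dcds without access conditions: every partial filling of its cells."""
    cells = sorted(dcds)
    # None means "cell left unfilled" in a given state.
    choices = [[None] + sorted(dcds[c]) for c in cells]
    result = set()
    for combo in product(*choices):
        result.add(frozenset((c, v) for c, v in zip(cells, combo) if v is not None))
    return result

point = {"X": set(range(5)), "Y": set(range(5))}
cpoint = dict(point, C={"red", "green", "blue"})

d_point, d_cpoint = states(point), states(cpoint)
s = frozenset({("X", 3), ("Y", 4)})   # the state {X = 3, Y = 4}
grown = s | {("C", "red")}            # the same state after adding C = red
```

Under these assumptions `d_point <= d_cpoint` holds, so the inclusion view places point below cPoint, and `grown` belongs to `d_cpoint` but not to `d_point`: adding information moved the state up rather than down.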

For  the  reasons  mentioned  above,  we  want  a  notion  of  subtyping  for  dcds’s  which  makes  them 
more  like  records.  The  difficulty  in  doing  this  comes  from  the  fact  that  we  cannot  offer  the  same 
guarantees. A record with type cPoint is guaranteed to contain all the fields X, Y, and C filled
with some value. That way we can “coerce” it to be a point by throwing away the color field.
In CDSO, with its computation model of incrementally growing a state, we cannot offer the same
guarantee because we have to be able to type “incomplete” information. For example, we have to
say that {C = red} : cPoint, even though it does not have the X and Y cells filled.

Our solution is to imagine that each cell in a dcds definition has a variant as a value, one
of whose elements is the special value $\Omega$, and the rest of which is the regular list of values from
the definition. Recall that according to CDSO’s operational semantics (see Appendix A.1), asking
for the value of a cell that is not filled in a state produces the result $\Omega$, which is printed by the
interpreter as a blank:
# {C=red};
request? X;
-->
request? Y;
-->

If we take this view, then we can make cPoint a subtype of point because we can coerce any state
of cPoint to one of point by throwing away events involving cell C. Cells X and Y might be filled
with the special value $\Omega$, but they are guaranteed to be there.

In general, when we assign a type to a state, we will imagine all initial cells present in the type
but not filled in the state, filled with $\Omega$. In addition, we want to do the least amount of filling
in necessary. For instance, if we are presented with the state {X = 3}, we will assign it the type
point, because we have to fill only one cell with $\Omega$, rather than cPoint, when we would have to fill
two cells. The formalization of these ideas does not take the form presented above (we will not be
translating CDSO’s type definitions into a type language with variants and the value $\Omega$) but this is
the intuition behind the definitions in this chapter.
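The “least amount of filling” intuition can be made concrete with a small sketch. It is only an illustration of the informal idea, not the formal system developed later in the chapter: it assumes dcds’s without access conditions (so every cell is initial), finite value sets, and it ignores ties, where the formal system would presumably use intersection types. The name `assign_type` is hypothetical.

```python
def assign_type(state, defined):
    """Pick the defined type that admits the state with the fewest unfilled cells.

    state   : dict mapping cell name -> value
    defined : dict mapping type name -> {cell name -> set of allowed values}
    Returns (type name, number of cells left unfilled), or None if no type fits.
    """
    best = None
    for name, cells in defined.items():
        fits = all(c in cells and v in cells[c] for c, v in state.items())
        if fits:
            fills = len(set(cells) - set(state))   # cells we must imagine as unfilled
            if best is None or fills < best[1]:
                best = (name, fills)
    return best

ints = set(range(10))   # finite stand-in for the open value range
defined = {
    "point": {"X": ints, "Y": ints},
    "cPoint": {"X": ints, "Y": ints, "C": {"red", "green", "blue"}},
}
```

For the state {X = 3} this picks point with one unfilled cell, while {C = red} can only be typed as cPoint, with X and Y left unfilled.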

Given  our  view  of  dcds’s  as  records,  we  can  easily  encode  records  in  our  language.  Fields  are 
cells, and the type of a field becomes an enumeration of constituent values. In fact, dcds’s are much
richer,  since  they  have  accessibility  conditions,  but  we  shall  not  be  using  accessibility  conditions  in 
any  essential  way  in  our  definition  of  subtyping. 

Aside  from  the  need  for  subtyping,  another  implication  of  our  decision  to  have  type  inference  is 
the need for intersection types. All dcds’s are user defined, and there is no requirement that different
dcds’s have distinct cell names and values, so it is possible to have a state belonging to several
of them.

If  we  want  to  be  able  to  perform  type  inference  on  the  full  version  of  CDSO,  we  also  need  to 
have  polymorphism  and  overloading.  As  we  have  seen  in  Section  2.3.4,  we  can  write  algorithms 
with  generic  (i.e.,  variable)  cell  names  and  values.  A  fully  generic  algorithm  (such  as  the  already 
presented  identity  algorithm)  will  give  rise  to  a  polymorphic  type,  while  a  partially  generic  algorithm 
(cell  names  and/or  values  are  partially  specified)  will  give  rise  to  an  overloaded  type. 
We now cover dcds definitions in more detail. There are two kinds of dcds’s: ground and
higher-order. The ground dcds’s are all user-defined. The following simplified grammar describes their
syntax (the full grammar can be found in Appendix C.1):


<ground>      ::=  <dcds-decla> | local <dcds-decla> in <dcds-decla> end
<dcds-decla>  ::=  letrec <dcds> | let <dcds>
<dcds>        ::=  <ID> = dcds {<component>}* end
<component>   ::=  cell <name> values <value_list> <access>
                |  graft <dcds>.<name> <access>
<value_list>  ::=  {<name>}+
<access>      ::=  ε | access <enabling> | <access> or <enabling>
<enabling>    ::=  <event> | <event> , <enabling>
<event>       ::=  <name> = <name>
We have already defined booleans and integers in Section 2.3.1. We provide a few more examples
of dcds definitions which will be used in what follows. First we define refinements of bool; intuitively,
a boolean is either true or false:

let  true  =  dcds  cell  B  values  tt  end; 
let  false  =  dcds  cell  B  values  ff  end; 

Next,  we  define  integer  lists.  This  is  another  example  of  the  use  of  recursion  and  grafting, 
similar  to  integer  streams  from  Section  2.3.1: 

letrec intlist = dcds
cell EMPTY values true, false
graft (int.1) access EMPTY = false
graft (intlist.1) access EMPTY = false
end;

The first few cells of intlist, together with their access conditions, look like this:

# show more 3 intlist;
{
EMPTY values true, false,
(EMPTY.1) values true, false access EMPTY=false,
((EMPTY.1).1) values true, false access (EMPTY.1)=false,
(N.1) values [..] access EMPTY=false,
((N.1).1) values [..] access (EMPTY.1)=false,
(((N.1).1).1) values [..] access ((EMPTY.1).1)=false}

Structurally, intlist is quite similar to the lazy natural numbers. There is only one initial cell,
EMPTY. If EMPTY = false, we can fill the value of the first integer in the list, N.1, and we are
allowed to fill EMPTY.1. The cells of the form EMPTY.1... form the “backbone” to which the
actual integers of the list attach, like vertebrae.

Following  the  example  in  the  introductory  chapter,  we  refine  intlist  into  empty  lists,  lists  of  a 
single  element,  and  lists  of  two  or  more  elements.  Specifying  empty  and  singleton  lists  is  simple: 

let  empty_intlist  =  dcds 
cell  EMPTY  values  true 
end; 

let one_intlist = dcds
cell EMPTY values false
cell (N.1) values [..] access EMPTY = false
cell (EMPTY.1) values true access EMPTY = false
end;

Defining  lists  of  two  or  more  elements  is  complicated  by  the  fact  that  we  need  an  intermediate 
dcds  definition,  yet  we  do  not  want  the  intermediate  definition  to  appear  ultimately  in  our  subtype 
hierarchy  and  be  used  for  type  inference,  so  we  use  a  local  definition: 
local letrec partial_intlist = dcds
cell (EMPTY.1) values true, false access EMPTY = false
cell (N.1) values [..] access EMPTY = false
graft (partial_intlist.1) access EMPTY = false
end
in let many_intlist = dcds
cell EMPTY values false
cell (N.1) values [..] access EMPTY = false
cell (EMPTY.1) values false access EMPTY = false
cell ((N.1).1) values [..] access (EMPTY.1) = false
graft (partial_intlist.1)
end
end;

This  definition  does  what  we  want;  the  first  few  cells  look  as  follows: 

# show more 6 many_intlist;
{
EMPTY values false,
(N.1) values [..] access EMPTY=false,
(EMPTY.1) values false access EMPTY=false,
((N.1).1) values [..] access (EMPTY.1)=false,
((EMPTY.1).1) values true, false access (EMPTY.1)=false,
(((EMPTY.1).1).1) values true, false access ((EMPTY.1).1)=false,
((N.1).1) values [..] access (EMPTY.1)=false,
(((N.1).1).1) values [..] access ((EMPTY.1).1)=false}

The  language  of  our  types  is  given  by  the  following  grammar,  where  g  stands  for  the  names  of 
ground  dcds  definitions: 

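$\tau ::= g \mid \alpha \mid \tau \times \tau \mid \tau \to \tau \mid \bigwedge[\tau_1..\tau_n] \mid \{\tau_1..\tau_n\} \mid \forall\alpha.\,\tau$

We shall use the variables $\tau$, $\sigma$, $\delta$ to range over types, and $\alpha$, $\beta$, ..., to range over type
variables. We use the notation $\bigwedge[\tau_1..\tau_n]$ for intersection types and $\{\tau_1..\tau_n\}$ for overloaded types;
sometimes the range of the subscript will be specified using a “comprehension” notation, $\bigwedge[\sigma_i \mid \cdots]$
and $\{\tau_j \mid \ldots\}$.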

5.3  Ground  dcds 

In what follows it will be convenient to assume that a dcds $M$ is given by a tuple $(C, V_c, \vdash_c)$, that
is, a set $C$ of cell names, a family $V_c$ of sets of values for each cell $c \in C$, and a family of enabling
(accessibility) relations $\vdash_c$ for each cell. This is slightly different from, but obviously equivalent to,
Definition 2.2.1.

5.3.1  Subtyping 

Intuitively,  there  are  two  cases  in  which  a  dcds  should  be  considered  a  subtype  of  another:  As  in 
the  example  of  points  and  colored  points,  a  subtype  could  be  an  extension  of  the  supertype,  adding 
more cells. Also, we would expect the refinements of bool and intlist to be subtypes; in this case,
the  subtype  has  fewer  cells,  and  belongs,  in  a  sense,  to  a  partition  of  the  supertype.  Our  definition 
of  subtyping  formalizes  this  intuition. 

Since all types are user-defined, the subtyping relation and the types we assign terms will vary
depending on what types have been defined. Our definitions will be parameterized by a set of defined
types, $\mathcal{C}$.

Definition 5.3.1 (Maximal state) A state $x$ is maximal if $A(x) = \emptyset$.

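Definition 5.3.2 (Subtyping for ground types) Given the set of defined types, $\mathcal{C}$, and two
ground dcds, $\sigma = (C_\sigma, V_c^\sigma, \vdash_c^\sigma)$ and $\tau = (C_\tau, V_c^\tau, \vdash_c^\tau)$, we say that $\sigma \le \tau$ if any of the following
holds:

1. $\sigma$ “extends” $\tau$ (written $\sigma \le_e \tau$), i.e.,

(a) $C_\tau \subseteq C_\sigma$

(b) $\forall c \in C_\tau.\ V_c^\sigma \subseteq V_c^\tau$

(c) $\forall c \in C_\tau.\ \vdash_c^\sigma = \vdash_c^\tau$

2. $\sigma$ belongs to a “partition” of $\tau$ (written $\sigma \le_p \tau$), i.e.,

(a) $C_\sigma \subseteq C_\tau$

(b) $\forall c \in C_\sigma.\ V_c^\sigma \subseteq V_c^\tau$

(c) $\forall c \in C_\sigma.\ \vdash_c^\sigma = \vdash_c^\tau$

(d) $x \in D(\sigma) \Rightarrow x \in D(\tau)$, and any maximal state $x \in D(\sigma)$ is also maximal in $D(\tau)$
(i.e., $\nexists y \in D(\sigma).\ y \supset x \;\Rightarrow\; \nexists y \in D(\tau).\ y \supset x$, for any $x \in D(\sigma)$).

3. There exists a finite chain $\zeta_1, \ldots, \zeta_n$ in $\mathcal{C}$ such that $\sigma \le_* \zeta_1 \le_* \cdots \le_* \zeta_n \le_* \tau$, where $\le_*$
ranges over $\{\le_e, \le_p\}$.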

Example 5.3.3 Given our dcds definitions of the previous section, we have the following subtype
relation:

cPoint $\le_e$ point        empty_intlist $\le_p$ intlist
true $\le_p$ bool           one_intlist $\le_p$ intlist
false $\le_p$ bool          many_intlist $\le_p$ intlist

It is clear why cPoint is a subtype by extension of point: it has more cells, as many values and
the same accessibility condition for the common cells. For the subtyping by partition, the only
interesting case is 2d. As can be verified, all maximal states in the subtypes are maximal in the
supertypes.
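For finite dcds’s, cases 1 and 2 of the definition can be checked mechanically. The sketch below is an illustration under explicit assumptions, not the decision procedure of Section 5.5: value sets are finite (a small integer range stands in for `[..]`), a dcds is a dictionary from cell names to a pair of values and enablings, a state is a set of (cell, value) events, and a state is maximal when it has no accessible cell. All function names are hypothetical.

```python
def cell(values, *enablings):
    """A cell: its value set and its enablings (each a set of (cell, value) events).
    A cell with no enablings is initial."""
    return (frozenset(values), frozenset(frozenset(e) for e in enablings))

def accessible(dcds, state):
    """Cells not yet filled in state that are initial or have an enabling inside state."""
    filled = {c for c, _ in state}
    return [c for c, (_, access) in dcds.items()
            if c not in filled and (not access or any(e <= state for e in access))]

def all_states(dcds):
    """Every state reachable from the empty state by filling accessible cells."""
    seen, todo = {frozenset()}, [frozenset()]
    while todo:
        x = todo.pop()
        for c in accessible(dcds, x):
            for v in dcds[c][0]:
                y = x | {(c, v)}
                if y not in seen:
                    seen.add(y)
                    todo.append(y)
    return seen

def same_on(s, t, cells):
    """Conditions (b) and (c): values of s within those of t, identical enablings."""
    return all(s[c][0] <= t[c][0] and s[c][1] == t[c][1] for c in cells)

def sub_ext(s, t):
    """Case 1: s extends t."""
    return set(t) <= set(s) and same_on(s, t, t)

def sub_part(s, t):
    """Case 2: s belongs to a partition of t."""
    if not (set(s) <= set(t) and same_on(s, t, s)):
        return False
    ds, dt = all_states(s), all_states(t)
    maximal_in_s = [x for x in ds if not accessible(s, x)]
    return ds <= dt and all(not accessible(t, x) for x in maximal_in_s)

point = {"X": cell(range(3)), "Y": cell(range(3))}
cpoint = dict(point, C=cell(["red", "green", "blue"]))
bool_t = {"B": cell(["tt", "ff"])}
true_t = {"B": cell(["tt"])}
# A one-level truncation of intlist and its empty refinement.
list_t = {"EMPTY": cell(["true", "false"]),
          "N": cell(range(3), {("EMPTY", "false")})}
empty_t = {"EMPTY": cell(["true"])}
```

On these stand-ins, `sub_ext(cpoint, point)`, `sub_part(true_t, bool_t)` and `sub_part(empty_t, list_t)` hold, whereas `sub_part(point, cpoint)` fails on condition 2d: a state with X and Y filled is maximal in point but can still be extended with C in cPoint.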

Example  5.3.4  We  provide  an  example  of  subtyping  in  which  case  3  of  the  definition  comes  into 
play.  Consider  the  following  dcds: 

let empty_colored = dcds
cell  EMPTY  values  true 
cell  C  values  red,  green,  blue 
end; 
We have empty_colored $\le_e$ empty_intlist and empty_intlist $\le_p$ intlist. However, empty_colored
and intlist are incomparable given only the first two cases of the definition. Using case 3 we can
conclude that empty_colored $\le$ intlist.
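Case 3 amounts to reachability in the graph whose edges are the direct (extension or partition) relations between defined types. The sketch below assumes those direct relations have already been computed and are handed in as a set of (subtype, supertype) name pairs; the function name `subtype` and this representation are illustrative choices, not the thesis's.

```python
def subtype(a, b, direct):
    """Decide a <= b by searching for a finite chain of direct subtype steps."""
    if a == b:                      # reflexivity
        return True
    seen, todo = {a}, [a]
    while todo:
        cur = todo.pop()
        for lo, hi in direct:
            if lo == cur and hi not in seen:
                if hi == b:
                    return True
                seen.add(hi)
                todo.append(hi)
    return False

# Direct relations from the example: one extension step, one partition step.
direct = {
    ("empty_colored", "empty_intlist"),
    ("empty_intlist", "intlist"),
}
# The same system without any relation through empty_intlist.
without_middle = set()
```

With `direct`, `subtype("empty_colored", "intlist", direct)` holds through the intermediate type empty_intlist. With `without_middle` it does not, which mirrors the dependence of the relation on the set of defined types.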

The  definition  of  subtyping  can  be  a  little  confusing,  because  a  ground  dcds  with  more  states 
can  be  either  above  or  below  one  with  fewer  states.  This  might  seem  to  imply  that  we  can  have 
a  contradictory  situation,  where  two  non-identical  dcds’s  are  simultaneously  above  and  below  each 
other.  The  following  propositions  show  that  not  to  be  the  case. 

Proposition 5.3.5 It is not possible to have simultaneously $\sigma \le_e \tau$ and $\tau \le_p \sigma$, when $\sigma \neq \tau$.

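Proof: By the definition of subtyping, in order for both $\sigma \le_e \tau$ and $\tau \le_p \sigma$ to hold, the following
conditions must hold:

1. $C_\tau \subseteq C_\sigma$

2. $\forall c \in C_\tau.\ V_c^\sigma = V_c^\tau$

3. $\forall c \in C_\tau.\ \vdash_c^\sigma = \vdash_c^\tau$

4. $D(\tau) \subseteq D(\sigma)$

5. all maximal states of $\tau$ are maximal in $\sigma$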

It is the last condition that cannot be met. There are three possible ways of adding more cells to
$\tau$ to get $\sigma$: (i) the extra cells are not initial, (ii) they are initial, or (iii) there is a combination of
initial and non-initial cells. The last two cases clearly cannot work, because we can easily extend
any maximal state in $\tau$ by filling one of the initial cells. For case (i), it would have to be the case
that the extra cells in $\sigma$ are enabled by states of $\tau$ which are not maximal, and the maximal states
of $\tau$ contain no enabling substate for cells of $\sigma$. But that implies that the states that enable a $\sigma$ cell
cannot be extended in $\tau$ to become maximal, i.e., they are already maximal. This is contradictory.
□

It might also seem strange to build transitivity into the definition of subtyping, as we have
done. The reason for this is that defining subtyping between dcds’s with incomparable sets of cells
(i.e., given $\sigma$, $\tau$ such that neither $C_\sigma \subseteq C_\tau$ nor $C_\tau \subseteq C_\sigma$) is difficult to do without making the
relation trivial. The way we chose to define subtyping makes it dependent on the set $\mathcal{C}$ of dcds’s
defined in the system. This implies that it is possible for two dcds’s to be unrelated, but to become
related with the addition of a new dcds definition (cf. Example 5.3.4). We now show that subtyping is
a partial order, listing first two obvious lemmas about the properties of $\le_e$ and $\le_p$.

Lemma 5.3.6 $\le_e$ is transitive and (vacuously) anti-symmetric.

Lemma 5.3.7 $\le_p$ is a partial order.

Proposition 5.3.8 $\le$ is a partial order.

Proof: Reflexivity is obvious from Lemma 5.3.7. Transitivity is built into the definition (case 3).
Anti-symmetry is somewhat more complicated to establish. Suppose we have

$\sigma \le_* \zeta_1 \le_* \cdots \le_* \zeta_n \le_* \tau$ and
$\tau \le_* \xi_1 \le_* \cdots \le_* \xi_m \le_* \sigma$,

where each $\le_*$ ranges over $\{\le_e, \le_p\}$. Then it is the case that $C_\sigma \cap C_\tau \neq \emptyset$. Departing from $\sigma$,
for instance, it is impossible to lose all of $C_\sigma$ through a chain of subtypes, since the accessibility
conditions do not change, therefore all initial cells must exist in the subtypes. By cases 1c and 2c
of the definition, it follows that $\forall c \in C_\sigma \cap C_\tau.\ V_c^\sigma = V_c^\tau$ and $\vdash_c^\tau = \vdash_c^\sigma$. We will now show this is
not possible assuming $\sigma \neq \tau$.

When subtyping by extension, we can introduce new dependencies, but only among the newly
introduced cells. When subtyping by partition, the only way in which we can “eliminate” cells from
the supertype is if they depend on some initial cell, which also appears in the subtype, but with
fewer values (not the ones enabling the missing cells; cf. part 2d of the definition). The upshot of
this is that starting with $\sigma$, say, we can eliminate some of $C_\sigma$ by subtyping, only by reducing the
value set of at least one cell that is left over. This will not work because of our earlier requirement
of matching values for the surviving piece. On the other hand, we can simply extend $\sigma$, thus
making $C_\sigma \subseteq C_\tau$. But then when we try to go the other way (by subtyping from $\tau$ to $\sigma$), we have
to eliminate $C_\tau \setminus C_\sigma$, which cannot be done except by losing values in $C_\sigma$ as we argued earlier.
Therefore, it must be the case that $\sigma = \tau$. □

To summarize, the following rules about subtyping hold (all typing rules are also listed in
Appendix A.2 for ease of reference):

(Sub-Refl)   $\sigma \le \sigma$

(Sub-Trans)  $\dfrac{\sigma \le \tau \qquad \tau \le \delta}{\sigma \le \delta}$

To this we add a rule about subtyping for products, which cannot be derived from our definition
of subtyping for ground dcds.

(Sub-Prod)   $\dfrac{\sigma_1 \le \tau_1 \qquad \sigma_2 \le \tau_2}{\sigma_1 \times \sigma_2 \le \tau_1 \times \tau_2}$

We now formalize the notion of coercion. Given $\sigma \le \tau$, we would expect there to be a way of
uniformly transforming any state of $\sigma$ into one of $\tau$. In particular, it should be the case that not
all states of $\sigma$ get mapped into the empty state, because, otherwise, any type can be coerced into
any type (since the empty state belongs to every type). So we want our coercions to satisfy this
extra condition. We will use $\varphi$, $\psi$ to range over coercions.


Definition 5.3.9 (Coercion) A coercion from $\sigma$ to $\tau$, where $\sigma$, $\tau$ are two types, is a function
$\varphi : \sigma \to \tau$, which is a restriction on the range of the identity on $\sigma$, $id_\sigma$, i.e., $\mathrm{range}\ \varphi \subseteq \mathrm{range}\ id_\sigma$,
and such that it does not map all states of $\sigma$ to the empty state, i.e., $\mathrm{range}\ \varphi \neq \emptyset$, unless $\sigma$ is the
empty dcds.


Proposition 5.3.10 [Subtyping and ground coercions] If $\sigma \le \tau$, then there exists a coercion
$\varphi : \sigma \to \tau$ such that if $x \in D(\sigma)$ then $\varphi(x) \in D(\tau)$.


Proof: By cases depending on the kind of subtyping.

1. $\sigma \le_e \tau$. Then $\varphi(x) = \{c = v \in x \mid c \in C_\tau\}$.

2. $\sigma \le_p \tau$. Then $\varphi$ is the identity on $\sigma$.

3. There exists a finite chain $\zeta_1, \ldots, \zeta_n$ in $\mathcal{L}$ such that $\sigma \le_* \zeta_1 \le_* \cdots \le_* \zeta_n \le_* \tau$, where
each $\le_*$ ranges over $\{\le_e, \le_p\}$. Then construct a coercion $\varphi_i$ for each piece of the chain and
$\varphi = \varphi_{n+1} \circ \cdots \circ \varphi_1$.

The first two cases obviously give rise to coercions. For the last case, recall from the proof of
Proposition 5.3.8 that $C_\sigma \cap C_\tau \neq \emptyset$, therefore the composition of the coercions will also satisfy the
coercion requirements. □
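
The ground cases of this proof can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: a state is modeled as a dictionary from cell names to values, the helper names are invented, and the cell set {X, Y} used for point is hypothetical.

```python
def coerce_by_extension(state, supertype_cells):
    """Case 1: keep only the events whose cell also exists in the supertype."""
    return {cell: value for cell, value in state.items() if cell in supertype_cells}


def coerce_by_partition(state):
    """Case 2: the coercion is the identity."""
    return state


def compose(*coercions):
    """Case 3: apply the coercions of a chain in order, first one first."""
    def composed(state):
        for phi in coercions:
            state = phi(state)
        return state
    return composed


# Hypothetical cell set for point; cPoint is assumed to add the cell C.
POINT_CELLS = {"X", "Y"}

print(coerce_by_extension({"X": 3, "C": "red"}, POINT_CELLS))  # {'X': 3}
print(coerce_by_extension({"C": "red"}, POINT_CELLS))          # {}
```

The second call shows why the definition of coercion only forbids mapping all states to the empty state: an individual state may still be coerced to it.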

We formalize what we mean by a subtype hierarchy. We will use S to range over subtype
hierarchies. Our definitions will now be predicated on S.

Definition 5.3.11 (Subtype hierarchy) A subtype hierarchy is a partially ordered set $(\mathcal{L}, \le)$,
where $\mathcal{L}$ is the set of defined types, and $\le$ is the subtype relation.

We are finally in a position to begin formalizing some of our intuitions from Section 5.1. We
define a notion of minimal type with fewest assumptions: if, given a state $x$, it is the case that
$x \in D(\sigma)$ for several $\sigma$, we want to call the lowest such $\sigma$ the minimal type of $x$ (let us defer,
for now, the possibility that there is no single lowest $\sigma$ to the next section, where we introduce
intersection types). However, if $x \in D(\sigma)$ and $x \in D(\tau)$ with $\sigma \le_e \tau$, then $x$ would require more
assumptions to be considered a member of $\sigma$, and so we would choose $\tau$ as the minimal type with
fewest assumptions. Note that we reason about assumptions in a very indirect fashion, as compared
to the presentation of Section 5.1.

Definition 5.3.12 (Minimal type with fewest assumptions) Given a subtype hierarchy S
and a state $x$, we say that $\tau$ is the minimal type with fewest assumptions of $x$, if $x \in D(\tau)$
and $\tau \le_e \xi$ and $x \in D(\xi)$, and $\tau$ is the lowest type with this property.

Example 5.3.13 Assuming the dcds definitions of the previous section, the minimal type with
fewest assumptions of {X = 3} is point, and the minimal type with fewest assumptions of {C = red}
is cPoint.

The  refinement  types  are  the  subtypes  by  partition  (<p).  We  are  particularly  interested  in  the 
case  when  enough  partitions  of  a  type  are  defined  so  that,  taken  together,  they  “cover”  the  type. 

Definition 5.3.14 (Complete partition) A set of types $\{\sigma_1, \ldots, \sigma_n\}$ is called a complete partition
of a type $\tau$ if $\forall i.\ \sigma_i \le_p \tau$ and $D(\tau) = \bigcup_i D(\sigma_i)$ and $\forall i \neq j.\ \sigma_i$ and $\sigma_j$ are incomparable.

Example 5.3.15 {true, false} is a complete partition of bool, and {empty, one, many} is a complete
partition of intlist. If, for instance, true were not defined, then the partition of bool would
not be complete.
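
For finite types, the three conditions of Definition 5.3.14 can be checked mechanically. The following Python sketch is an approximation under stated assumptions: states are frozensets of (cell, value) events, the requirement that each part is a subtype by partition is approximated by inclusion of state sets (the maximality requirement is not checked), and incomparability is approximated by mutual non-inclusion. The state sets given for bool, true, and false are hypothetical.

```python
from itertools import combinations


def is_complete_partition(parts, whole):
    """parts: dict from type name to its set of states; whole: states of the
    partitioned type. Returns True if the parts cover `whole` exactly and
    no part's state set is included in another's."""
    each_below = all(states <= whole for states in parts.values())
    covers = set().union(*parts.values()) == whole
    incomparable = all(
        not parts[a] <= parts[b] and not parts[b] <= parts[a]
        for a, b in combinations(parts, 2)
    )
    return each_below and covers and incomparable


# Hypothetical state sets: the empty state plus one filled cell B.
EMPTY = frozenset()
TT = frozenset({("B", "tt")})
FF = frozenset({("B", "ff")})
BOOL, TRUE, FALSE = {EMPTY, TT, FF}, {EMPTY, TT}, {EMPTY, FF}

print(is_complete_partition({"true": TRUE, "false": FALSE}, BOOL))  # True
print(is_complete_partition({"false": FALSE}, BOOL))                # False
```

The second call mirrors the remark in Example 5.3.15: once true is missing, the remaining parts no longer cover bool.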

5.3.2  Intersection  types 

In  the  previous  section  we  ignored  the  possibility  that  a  state  might  have  several,  incomparable, 
minimal  types  with  fewest  assumptions.  To  handle  such  cases,  we  introduce  intersection  types.  In 
general,  the  type  of  a  state  will  be  an  intersection  of  types. 

Definition 5.3.16 (Minimal type) Given a subtype hierarchy S and a state $x$, we say that
$\bigwedge[\tau_1..\tau_n]$ is the minimal type of $x$ if the $\tau_i$ are all the incomparable minimal types of $x$ with fewest
assumptions.

We now present our replacement for the original notion of typing judgments in CDS0. The idea
is to say that $x : \tau$ even in cases when $x \notin D(\tau)$, provided there exists a type $\sigma$ below $\tau$ for which
$x : \sigma$ does hold. Then we can coerce $x$ to something that is in $D(\tau)$. So we still use the original
notion of belonging to the set of states, but only to get a “foothold” into the subtype hierarchy,
and from there we apply the subtyping rules.

(And-Intro)   $\dfrac{x : \sigma_1 \quad \cdots \quad x : \sigma_n}{x : \bigwedge[\sigma_1..\sigma_n]}$

(And-Elim)    $\dfrac{x : \bigwedge[\sigma_1..\sigma_n]}{x : \sigma_i}$

(Sub-And-R)   $\dfrac{\forall i.\ \sigma \le \tau_i}{\sigma \le \bigwedge[\tau_1..\tau_n]}$

(Sub-And-L)   $\bigwedge[\sigma_1..\sigma_n] \le \sigma_i$

Figure 5.1: Typing and subtyping for intersection types


Definition 5.3.17 (Meaning of typing judgments for ground states) Given a subtype hierarchy
S and a state $x$, we say that $x : \tau$, if $\sigma \le \tau$, where $\sigma$ is the minimal type of $x$.

Example 5.3.18 The minimal type of {C = red} is cPoint and not point, because it is the case
that {C = red} $\in$ cPoint and there is no type below cPoint for which this holds. On the other
hand, {C = red} $\notin$ point. However, we do have {C = red} : point, since cPoint $\le$ point, and so
we can coerce {C = red} into a state of point (in this case it is $\emptyset$).
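
Definition 5.3.17 reduces a typing judgment to a lookup in the subtype relation. A minimal Python sketch of that reading follows. It assumes the minimal type of the state is already known and is a single type rather than an intersection, and it computes the relation as the reflexive-transitive closure of a hypothetical set of declared pairs; the function names are illustrative and do not come from the CDS0 implementation.

```python
def subtype_closure(types, declared):
    """Reflexive-transitive closure of declared (subtype, supertype) pairs."""
    rel = {(t, t) for t in types} | set(declared)
    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            for c, d in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel


def has_type(minimal_type, tau, rel):
    """x : tau holds iff the minimal type of x is below tau."""
    return (minimal_type, tau) in rel


# Hypothetical hierarchy with a single declared pair, cPoint below point.
REL = subtype_closure({"point", "cPoint"}, {("cPoint", "point")})

# The minimal type of {C = red} is cPoint (Example 5.3.18).
print(has_type("cPoint", "point", REL))   # True
print(has_type("point", "cPoint", REL))   # False
```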

Our  typing  and  subtyping  rules  for  intersection  types  are  summarized  in  Figure  5.1. 

In the general setting of intersection types, we can state and prove a property which we would
expect to hold of any typing system based on dcds’s, i.e., that incremental computation does not
increase the type of a state. If we extend a ground state by a new event, we want the type of the
new state to be at most as high as before.

Theorem 5.3.19 Incremental computation does not increase the minimal type of a ground state.

Proof: Suppose we have two states, $x$, $y$, such that $x \mathrel{-\!\!\subset_c} y$, and $x : \sigma$, $y : \tau$, where $\sigma, \tau \in S$, our
subtype hierarchy. We want to show that the minimal type of $y$ is not above the minimal type of
$x$.

By the definition of typing judgments, the minimal types of $x$, $y$ will have forms $\bigwedge[\sigma_1..\sigma_n]$ and
$\bigwedge[\tau_1..\tau_m]$, respectively. We need to show that $\forall \sigma_i\ \exists \tau_j.\ \tau_j \le \sigma_i$.

Pick a $\sigma_i$. Suppose it is incomparable to all the $\tau_j$. But then it should be included among them
as part of the minimal type of $y$. Now suppose there is a $\tau_j$ such that $\sigma_i \le \tau_j$. It has to be the case
that $\sigma_i \le_e \tau_j$, otherwise $\sigma_i$ would have been listed as one of the minimal types of $y$, instead of $\tau_j$.
But then, since $x \in \tau_j$, $\sigma_i$ should not be among the minimal types of $x$; $\tau_j$ should be listed instead.
Therefore, there is a $\tau_j$ such that $\tau_j \le \sigma_i$. □


5.4  Sequential  algorithms 

Higher-order dcds’s, whose states are the sequential algorithms, are treated somewhat differently
from ground dcds’s. Intersection types are also present, and the meaning of a typing judgment
does not change; however, an analogue of Theorem 5.3.19 does not hold for higher-order states.

5.4.1 Subtyping

As in the case of products, we cannot apply the definition of subtyping for ground dcds’s to the
higher-order case. Instead, we have to impose an ordering from outside. The subtyping rule for
arrow is the conventional one, and for the conventional reason: we want it to encode the idea of
substitutability. Even though sequential algorithms are more than just functions (they contain
intensional information), we should be able to collapse intensional substitutability to the extensional one.


(Sub-Arrow)   $\dfrac{\sigma_2 \le \sigma_1 \qquad \tau_1 \le \tau_2}{\sigma_1 \to \tau_1 \le \sigma_2 \to \tau_2}$

If a higher-order type is a subtype of another, it should still be possible to coerce states of the
former into states of the latter.

Proposition 5.4.1 [Subtyping and higher-order coercions] If $\sigma_1 \to \tau_1 \le \sigma_2 \to \tau_2$, then there exists
a coercion $\varphi : (\sigma_1 \to \tau_1) \to (\sigma_2 \to \tau_2)$ such that if $x \in D(\sigma_1 \to \tau_1)$ then $\varphi(x) \in D(\sigma_2 \to \tau_2)$.


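Proof: By Sub-Arrow we have $\sigma_2 \le \sigma_1$ and $\tau_1 \le \tau_2$, and by Proposition 5.3.10 there exist
coercions $\varphi_{in} : \sigma_2 \to \sigma_1$ and $\varphi_{out} : \tau_1 \to \tau_2$. We define an event-by-event version of $\varphi$, called $\varphi_e$,
based on the kinds of events of $\sigma_1 \to \tau_1$:

1. $xc' = \mathit{valof}\ c$, where $c' \in C_{\tau_1}$, $x \in D(\sigma_1)$ and $c \in C_{\sigma_1}$. Then

$$\varphi_e(xc' = \mathit{valof}\ c) = \begin{cases} \emptyset, & \text{if } \varphi_{in}(x) \neq x, \text{ or } \varphi_{in}(\{c = *\}) = \emptyset, \text{ or } \varphi_{out}(\{c' = *\}) = \emptyset \\ \{xc' = \mathit{valof}\ c\}, & \text{otherwise} \end{cases}$$

The notation $c = *$ means fill cell $c$ with any reasonable value; we are interested in whether a particular
coercion elides all events involving $c$.

2. $xc' = \mathit{output}\ v'$, where $c' \in C_{\tau_1}$, $v' \in V_{c'}^{\tau_1}$, and $x \in D(\sigma_1)$. Then

$$\varphi_e(xc' = \mathit{output}\ v') = \begin{cases} \emptyset, & \text{if } \varphi_{in}(x) \neq x, \text{ or } \varphi_{out}(\{c' = v'\}) = \emptyset \\ \{xc' = \mathit{output}\ v'\}, & \text{otherwise} \end{cases}$$

Then $\varphi(a) = \bigcup(\mathit{map}\ \varphi_e\ a)$. □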

Example 5.4.2 Suppose we are given the following state of bool $\to$ cPoint:

    { {}X = valof B, {B=tt}X = output 3, {B=ff}X = output 4,
      {}C = valof B, {B=tt}C = output red }.

Since bool $\to$ cPoint $\le$ true $\to$ point, there should be a coercion that transforms the above state into
one of true $\to$ point. The result of applying the coercion generated by Proposition 5.4.1 is:

    { {}X = valof B, {B=tt}X = output 3 }.

The meaning of typing judgments in the higher-order case is the same as in the ground case:
again we use “belongs to the set of states of” as a foothold, and then use the subtyping rules.

Definition 5.4.3 (Meaning of typing judgments for algorithms) Given an algorithm $a$, we
say that $a : \sigma \to \tau$ if $a \in D(\sigma' \to \tau')$, where $\sigma' \to \tau' \le \sigma \to \tau$.

We note that this definition has an interesting implication, as established by the following
proposition.

Proposition 5.4.4 If $a : \sigma \to \tau$, then given an input $x : \sigma$, $a.x : \tau$.

Proof: Assume $a \in D(\sigma' \to \tau')$ where $\sigma' \to \tau' \le \sigma \to \tau$. Given an $x : \sigma$, i.e., $x \in D(\sigma_{in})$ for some
$\sigma_{in} \le \sigma$, we have $\sigma_{in} \le \sigma \le \sigma'$. Therefore, we can coerce $x$ to $\sigma'$, so $a.x \in D(\tau')$, which
implies $a.x : \tau$. □
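
The Sub-Arrow rule used in this subsection is contravariant in the argument and covariant in the result. The following Python sketch shows a checker for that rule. It is an illustration only: the Arrow class and is_subtype are hypothetical names, ground types are plain strings, and the ground relation is assumed to be given as an already transitively closed set of pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    src: object   # argument type
    dst: object   # result type


def is_subtype(s, t, ground):
    """Decide s <= t for ground names and arrows built from them."""
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Sub-Arrow: argument types are compared in the opposite direction.
        return is_subtype(t.src, s.src, ground) and is_subtype(s.dst, t.dst, ground)
    if isinstance(s, Arrow) or isinstance(t, Arrow):
        return False
    return s == t or (s, t) in ground


# Ground pairs assumed from the running examples.
GROUND = {("true", "bool"), ("cPoint", "point")}

# bool -> cPoint <= true -> point, as in Example 5.4.2.
print(is_subtype(Arrow("bool", "cPoint"), Arrow("true", "point"), GROUND))  # True
print(is_subtype(Arrow("true", "point"), Arrow("bool", "cPoint"), GROUND))  # False
```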

5.4.2  Intersection  types 

In general, an algorithm will have an intersection type. Suppose the type of the inputs for algorithm
$a$ is $\bigwedge[\sigma_1..\sigma_n]$, and the outputs have type $\bigwedge[\tau_1..\tau_m]$. What should the type of the algorithm be in
that case? We would not want it to be $\bigwedge[\sigma_1..\sigma_n] \to \bigwedge[\tau_1..\tau_m]$, because, as would be apparent by
examination of the Sub-Arrow rule, we would not then be able to apply the algorithm to an input
of type $\sigma_i$. A natural choice, then, would be $\bigwedge[\sigma_i \to \bigwedge[\tau_1..\tau_m] \mid i \in 1..n]$, but this is inconvenient,
because of the nested intersections.

We will construct a canonical type for $a$ with all the intersections at the outer level. The type
of $a$ will be $\bigwedge[\sigma_i \to \tau_j \mid i \in 1..n,\ j \in 1..m]$. The justification for this is provided by the following
subtyping rule which establishes an equivalence between our canonical type and the natural type
one would expect for the algorithm:

(Sub-And-Dist)   $\bigwedge[\sigma \to \tau_1 \,..\, \sigma \to \tau_n] \le \sigma \to \bigwedge[\tau_1..\tau_n]$

The equivalence is due to the fact that we can deduce the opposite of Sub-And-Dist using
Sub-And-L, Sub-Arrow, and Sub-And-R. By Sub-And-L we have $\bigwedge[\tau_1..\tau_n] \le \tau_i$ for each $i$.
By Sub-Arrow it follows that $\sigma \to \bigwedge[\tau_1..\tau_n] \le \sigma \to \tau_i$ for each $i$. Therefore, by Sub-And-R, we
conclude that $\sigma \to \bigwedge[\tau_1..\tau_n] \le \bigwedge[\sigma \to \tau_1 \,..\, \sigma \to \tau_n]$.

Definition 5.4.5 (Canonical type for algorithms) Given an algorithm $a$, with minimal input
type $\bigwedge[\sigma_1..\sigma_n]$ and minimal output type $\bigwedge[\tau_1..\tau_m]$, the canonical type of $a$ is given by
$\bigwedge[\sigma_i \to \tau_j \mid i \in 1..n,\ j \in 1..m]$.

Our canonical type provides a valid type for an algorithm $a$. If $a : \bigwedge[\sigma_i \to \tau_j \mid i \in 1..n,\ j \in 1..m]$,
then it is the case that $a : \sigma_i \to \tau_j$ for each $i$ and $j$. In addition, we can apply $a$ to any input $x : \sigma_i$,
for any $i$, and conclude that $a.x : \tau_j$ for all $j$; therefore, $a.x : \bigwedge[\tau_1..\tau_m]$.
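
The construction of the canonical type is a plain product of the input and output components. A short Python sketch makes this concrete; the function names and the placeholder type names s1, s2, t1, t2, t3 are hypothetical, and an intersection is represented simply as a list of its components.

```python
from itertools import product


def canonical_type(input_types, output_types):
    """All arrows sigma_i -> tau_j, represented as (sigma_i, tau_j) pairs."""
    return list(product(input_types, output_types))


def result_types(canonical, input_type):
    """Components of the result type when applying to an input of one sigma_i."""
    return [tau for sigma, tau in canonical if sigma == input_type]


canon = canonical_type(["s1", "s2"], ["t1", "t2", "t3"])
print(len(canon))                 # 6
print(result_types(canon, "s2"))  # ['t1', 't2', 't3']
```

Applying to an input of type s2 recovers every output component, which is the observation that a.x has the full output intersection as its type.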

5.5  Decidability  of  monomorphic  subtyping 

It is not at all obvious, at first glance, how one can decide when a dcds $\sigma$ is a subtype of another,
$\tau$. There are three problematic requirements: checking when the set of cells of $\sigma$ is a subset of
the set of cells of $\tau$, checking that all states of $\sigma$ are also states of $\tau$, and verifying the maximality
requirement for subtyping by partition. Fortunately, dcds’s are infinite regular trees [24], so it is
possible to decide each of the above properties.

First, note that deciding subtyping is easy when we have dcds’s with a finite number of cells.
The interesting case occurs with dcds’s that have an infinite number of cells. There are two ways
in which such dcds’s can arise: a dcds definition with a cell name that contains an infinite interval
tag, such as R.[..] (cf. Appendix C.1), or by recursion.

The first case is easy to handle, since the representation is very compact. We can check if
another dcds contains cells of the type R.[..] simply by comparing the endpoints of the intervals.
For the second case, we can view a dcds as a tree: all cells are leaves and grafting creates a subtree.
A recursive dcds (with possibly other recursive dcds’s embedded in it by grafts) is an infinite regular
tree. The access conditions on cells pose no problem, as they also have a regular structure.

We shall first present an algorithm for deciding subset-of on dcds states. The same algorithm
can be used, with only minor modifications, to decide inclusion of sets of cells. Note that in both
subtyping by extension and by partition, the accessibility condition is required to be the same
on the common subset of cells of the subtype and supertype. Without much additional effort,
however, we can decide subset-of in the general setting of different access conditions. Consequently,
we present a definition of “weaker-than” for accessibility conditions. A weaker access condition is
more permissive, i.e., easier to satisfy. We view an enabling relation for a cell $c$ ($\vdash c$) as consisting
of a family $\mathcal{E}$ of sets of events which enable $c$ (also written $\mathcal{E} \vdash c$).

Definition 5.5.1 (Weaker-than access condition) Given two enabling relations on a cell of a
dcds, we say that $\mathcal{E}_1 \vdash c \le \mathcal{E}_2 \vdash c$ (read $\vdash_1 c$ is weaker than $\vdash_2 c$) iff $\forall E_2 \in \mathcal{E}_2.\ \exists E_1 \in \mathcal{E}_1.\ E_1 \subseteq E_2$.
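
Definition 5.5.1 translates directly into a nested quantifier check. A minimal Python sketch follows, assuming each enabling family is a finite collection of frozensets of (cell, value) events; the function name and the sample families are hypothetical.

```python
def weaker_than(family1, family2):
    """True iff every enabling set in family2 contains some enabling set
    of family1, i.e. family1 is at least as easy to satisfy."""
    return all(any(e1 <= e2 for e1 in family1) for e2 in family2)


# Hypothetical families for one cell.
needs_b = [frozenset({("B", "tt")})]
needs_b_and_x = [frozenset({("B", "tt"), ("X", 3)})]

print(weaker_than(needs_b, needs_b_and_x))  # True
print(weaker_than(needs_b_and_x, needs_b))  # False
```

An initial cell, enabled by the empty set of events, is weaker than any other condition under this check, since the empty set is included in every enabling set.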


The  basic  idea  is  to  prove  subset-of  by  induction  on  the  size  of  cell  names.  A  graft  will  always 
increase  the  length  of  a  cell  name.  As  we  “unroll”  an  infinite  dcds  one  layer  at  a  time,  all  cell  names 
will  get  longer.  If  a  dcds  has  not  been  able  to  generate  the  cells  from  another  dcds,  we  need  not 
continue  checking  past  a  point  determined  by  the  length  of  those  cells. 


Definition  5.5.2  (Cell  name  length)  The  length  of  a  cell  name  c  is  the  number  of  tags  in  it. 

In  order  to  define  the  concept  of  unrolling  a  dcds,  we  have  to  fix  a  representation  for  dcds’s. 
This  is  probably  easiest  to  accomplish  by  using  the  actual  Standard  ML  representation  from  our 
implementation: 


    datatype cva = Plain of cell * value list * access list
                 | Delay of idcds * ((cell * value list * access list) ->
                                     (cell * value list * access list))

    and idcds = Nonrec of cva list
              | Rec of cva list * ((cell * value list * access list) ->
                                   (cell * value list * access list)) *
                                  ((cell * value list * access list) ->
                                   (cell * value list * access list))


The internal representation of a dcds (idcds) is a list of cells, values, and access conditions (cva)
packaged as either a non-recursive or recursive dcds. It is not important what the representation
for cells, values, and access conditions is for our purposes. In the case of a recursive dcds, we also
package two functions: one to generate the initial recursive step, and another for all subsequent
steps (the reason for having two functions is not relevant). A cva can be either a plain triplet of a
cell, a list of values, and access conditions, or, when we graft a recursive dcds into another, a pair
of the grafted dcds and a function that applies the graft tag. In particular, note that something
packaged as a non-recursive dcds at top-level can actually have recursive dcds’s embedded in it.

The depth of a dcds is a measure of how deeply embedded a recursive graft is. In the case
of a non-recursive dcds, it is simply the number of cva entries, whereas for recursive dcds’s, we
increment the number of cva entries by one. In both cases, we increment the depth by one plus
the maximum depth of any recursive graft. We present the actual Standard ML code for the depth
function, since it is very simple.

Definition 5.5.3 (Dcds depth) The depth of a dcds $\sigma$ is given by the following mutually recursive
functions:

    (* Int.max is the maximum function of the SML Basis Library *)
    fun findDelay [] = 0
      | findDelay ((Delay(d,_))::l) = 1 + Int.max(depth d, findDelay l)
      | findDelay ((Plain(_,_,_))::l) = findDelay l
    and depth (Nonrec cvas) = (length cvas) + (findDelay cvas)
      | depth (Rec(cvas,_,_)) = 1 + (length cvas) + (findDelay cvas)

Example 5.5.4 We apply the definition of depth to some of the previously presented dcds’s:

    depth(bool) = 1
    depth(intlist) = 3
    depth(many_intlist) = 8

Note  that  the  actual  number  returned  by  depth  is  not  meaningful  in  and  of  itself.  It  counts  the 
number  of  cva  elements  with  the  addition  of  a  large  “penalty”  for  embedded  recursive  grafts.  Most 
importantly,  it  is  designed  to  work  in  conjunction  with  the  unrolling  of  a  dcds,  which  essentially 
retraces  the  computation  of  depth ,  by  listing  the  cva  elements  of  the  dcds  up  to  some  specified 
limit. 
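
To make the counting concrete, here is a Python transcription of the depth function. It is a sketch under assumptions: the class names mirror the constructors of the Standard ML datatype, the field names first and later are invented, and the sample dcds's below are hypothetical encodings (a one-cell bool, and a one-cell recursive dcds called m1).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Plain:            # plain (cell, values, access) triple
    cell: tuple
    values: list
    access: list


@dataclass
class Delay:            # grafted dcds plus the function applying the graft tag
    dcds: object
    graft: Callable


@dataclass
class Nonrec:
    cvas: list


@dataclass
class Rec:
    cvas: list
    first: Callable     # generates the initial recursive step
    later: Callable     # generates all subsequent steps


def find_delay(cvas):
    """Penalty for embedded grafts: 1 + the deepest graft, 0 if there is none."""
    if not cvas:
        return 0
    head, rest = cvas[0], cvas[1:]
    if isinstance(head, Delay):
        return 1 + max(depth(head.dcds), find_delay(rest))
    return find_delay(rest)


def depth(dcds):
    base = len(dcds.cvas) + find_delay(dcds.cvas)
    return 1 + base if isinstance(dcds, Rec) else base


identity = lambda cva: cva
bool_dcds = Nonrec([Plain(("B",), ["tt", "ff"], [])])
m1 = Rec([Plain(("N", "l"), ["[..]"], [])], identity, identity)
outer = Nonrec([Plain(("A",), [], []), Delay(m1, identity)])

print(depth(bool_dcds), depth(m1), depth(outer))  # 1 2 5
```

The value 5 for outer is 2 cva entries, plus 1 for the graft, plus the depth 2 of the grafted dcds.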

Definition 5.5.5 (Dcds unrolling) Unrolling a dcds $\sigma$ $d$ times builds a cva list of $\sigma$, such that
any embedded recursion at depth $d$ gets to apply its recursive graft at least once. A simplified version
of our implementation (it omits the raising of exceptions with error messages in certain cases) is
given in Figure 5.2.

The function iterate takes a single cva element and the functions which apply the tags from a
recursive dcds, and generates the specified number of new cva elements. The function listCva takes
a list of cva elements and emits at least the specified number of elements from the list (assuming
the list is long enough). In the case of a nonrecursive dcds definition, listCva will list exactly the
specified number of cva elements. For recursive definitions, it will list at least the number specified.
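
The behaviour of iterate, listCva, and unroll described above can be mirrored in Python. This is a hedged transcription, not the implementation itself: dcds's and cva entries are encoded as tagged tuples, a cell name is a tuple of a base name followed by its tags, and the helper add_tag and the encodings M1 and M2 are hypothetical.

```python
def iterate(i, first, f1, fi, cva):
    """Produce i elements: the cva itself on the first call, then its successors."""
    if i == 0:
        return []
    if first:
        if i == 1:
            return [cva]
        new = f1(cva)
        return [cva, new] + iterate(i - 2, False, f1, fi, new)
    new = fi(cva)
    return [new] + iterate(i - 1, False, f1, fi, new)


def list_cva(i, cvas):
    if i == 0 or not cvas:
        return []
    head, rest = cvas[0], cvas[1:]
    if head[0] == "plain":
        return [head[1]] + list_cva(i - 1, rest)
    # head is ("delay", ("rec", cvas, f1, fi), graft)
    _, (_, rec_cvas, f1, fi), graft = head
    rec_part = [iterate(i, True, f1, fi, c) for c in list_cva(i, rec_cvas)]
    ordered = [c for chunk in rec_part for c in chunk]
    return [graft(c) for c in ordered] + list_cva(i - 1, rest)


def unroll(i, dcds):
    if dcds[0] == "nonrec":
        return list_cva(i, dcds[1])
    _, cvas, f1, fi = dcds
    rec_part = [iterate(i, True, f1, fi, c) for c in list_cva(i, cvas)]
    return [c for chunk in rec_part for c in chunk]


def add_tag(tag):
    """Hypothetical graft function: append one tag to the cell name."""
    return lambda cva: (cva[0] + (tag,), cva[1], cva[2])


M1 = ("rec", [("plain", (("N", "l"), "[..]", []))], add_tag("l"), add_tag("l"))
M2 = ("rec", [("plain", (("N", "l"), "[..]", [])),
              ("plain", (("N", "l", "l"), "[..]", []))],
      add_tag("s"), add_tag("s"))

for cell, _, _ in unroll(3, M2):
    print(".".join(cell))
```

With these encodings, unrolling M2 three times lists N.l, N.l.s, N.l.s.s, N.l.l, N.l.l.s, and N.l.l.s.s, in that order.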

We have already seen some examples of unrolling a dcds at the beginning of this chapter. The
command show more d $\sigma$ from our CDS0 interpreter actually performs unroll($d$, $\sigma$). We refer the
reader to Section 5.2 for examples of usage of this function on intlist and many_intlist.

We  list  some  properties  of  the  definitions  of  unrolling  and  depth,  which  will  be  needed  in 
establishing  the  correctness  of  our  decision  procedures. 

Lemma 5.5.6 Given a dcds $\sigma$ with depth($\sigma$) = $d_\sigma$, the computation of unroll($d_\sigma$, $\sigma$) results in the
computation of unroll($d$, $\tau$), where $\tau$ is the innermost embedded recursive dcds in $\sigma$, and $d \ge d_\tau$,
where $d_\tau$ is the depth of $\tau$.

Proof: By examination of the code for unroll it is apparent that it simply retraces the computation
of depth. When presented with a nonrecursive dcds, unroll subtracts one from its running total for
each plain cva element. When it encounters an embedded recursive dcds $\tau$, unroll calls iterate on
it with an argument equal to the remaining count, which will be at least as large as the depth of
$\tau$, depending on its exact position in the list of cva elements. Suppose $\tau$ is last in the list of a dcds
definition $\sigma$ (least favorable position from our point of view); further, suppose there are $i$ plain cva
elements, and that dcds $\tau$ has depth $d_\tau$. Then the depth of $\sigma$ will be $d_\sigma = i + 2 + d_\tau$. The call
unroll($d_\sigma$, $\sigma$) will then lead to the call unroll($d_\tau + 2$, $\tau$), which establishes our result. □

    fun iterate (0, _, _, _) _ = []
      | iterate (i, first, f1, fi) (c,v,a) =
          if (first)   (* first iteration: apply f1 *)
          then if (i = 1) then [(c,v,a)]
               else let val newCva = f1(c,v,a)
                    in (c,v,a)::(newCva::(iterate(i-2,false,f1,fi) newCva))
                    end
          else let val newCva = fi(c,v,a)
               in newCva::(iterate(i-1,false,f1,fi) newCva)
               end

    fun listCva (0, _) = []
      | listCva (i, []) = []
      | listCva (i, cva::l) =
          (case cva of
             (Plain(c,v,a)) => [(c,v,a)] @ (listCva(i-1,l))
           | (Delay(Rec(cvaList,f1,fi),f)) =>
               let val recPart = map (iterate(i,true,f1,fi))
                                     (listCva(i,cvaList))
                   val ordered = flatten recPart
               in (map f ordered) @ (listCva(i-1,l))
               end)

    fun unroll i (Nonrec cvaList) = listCva(i, cvaList)
      | unroll i (Rec(cvaList,f1,fi)) =
          let val recPart = map (iterate(i,true,f1,fi)) (listCva(i,cvaList))
          in flatten recPart
          end

Figure 5.2: Definition of unrolling

Proposition 5.5.7 Given a dcds $\sigma$ with depth($\sigma$) = $d_\sigma$, the computation of unroll($d_\sigma$, $\sigma$) results
in a cva list such that the innermost embedded recursive dcds in $\sigma$ gets to apply its recursive graft
at least once.

Proof: According to Lemma 5.5.6, the call unroll($d_\sigma$, $\sigma$) will result in a call unroll($d$, $\tau$) with $\tau$ the
innermost embedded recursive dcds, and $d \ge d_\tau$, the depth of $\tau$. Then it suffices to examine what
happens when unroll is applied to a simple recursive dcds with an argument equal to the depth of
that dcds.

Suppose $\tau$ is a simple recursive dcds (i.e., it contains no embedded recursive dcds’s) and it
has depth $d_\tau$. unroll($d_\tau$, $\tau$) will call iterate with argument $d_\tau$ for every cva element in $\tau$. There
must be at least one cva element, hence $d_\tau \ge 2$ and therefore, iterate will get to apply at least one
recursive graft. □

From  the  previous  two  results,  it  is  easy  to  derive  the  following  Corollary,  which  will  prove 
useful  in  what  follows. 

Corollary 5.5.8 Given a dcds $\sigma$ with depth($\sigma$) = $d_\sigma$, the computation of unroll($d_\sigma + n$, $\sigma$) results
in a cva list such that the innermost embedded recursive dcds in $\sigma$ gets to apply its recursive graft
at least $n + 1$ times.

When given two infinite dcds’s $\sigma$ and $\tau$, and having to decide whether $D(\sigma) \subseteq D(\tau)$, the idea
is to unroll each dcds up to some point and check the finite cva lists for subset-of. We need to
be careful to unroll enough of each dcds in order to make the comparison. Consider the following
example:

    letrec M1 = dcds
      cell (N.l) values [..]
      graft (M1.l)
    end;

    letrec M2 = dcds
      cell (N.l) values [..]
      cell ((N.l).l) values [..]
      graft (M2.s)
    end;

Dcds $M_1$ has cells of the form N.l... with one or more l tags while $M_2$ has cells of the form
N.l.s... or N.l.l.s... with zero or more s tags. If we only unrolled $M_1$ and $M_2$ once and made
the comparison, we would conclude that $D(M_1) \subseteq D(M_2)$, which is false. We need to unroll $M_1$
enough times to allow it to differentiate itself from the nonrecursive part of $M_2$, and only then can
we unroll $M_2$ enough times to enable it to generate all of $M_1$’s cells, if it can. With this in mind,
we present our decision procedure for subset-of.

Algorithm 5.5.9 (Deciding subset-of for dcds states) Given two dcds’s, $\sigma$ and $\tau$, to determine
if $D(\sigma) \subseteq D(\tau)$ do the following:

1. Let $d_\sigma$ = depth($\sigma$) and $d_\tau$ = depth($\tau$).

2. Let $L$ = unroll($d_\tau$, $\tau$).

3. Let $name_\tau$ = length of longest cell name in $L$.

4. Let $L_\sigma$ = unroll($d_\sigma + name_\tau$, $\sigma$).

5. Let $name_\sigma$ = length of longest cell name in $L_\sigma$.

6. Let $L_\tau$ = unroll($d_\tau + name_\sigma$, $\tau$).

7. Compare the finite $L_\sigma$, $L_\tau$ for subset-of.
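
The seven steps can be sketched in Python for the special case of simple recursive dcds's. This is a simplified model under assumptions that are not in the text: a dcds is a pair of its base cell names and the single tag its recursive step appends, a cell name is a tuple of a base name followed by tags, and step 7 compares cell names only, ignoring values and access conditions. All function names are hypothetical.

```python
def depth(dcds):
    base, _tag = dcds
    return 1 + len(base)            # recursive dcds: number of cva entries plus one


def unroll(d, dcds):
    """Each of the first d base cells yields d names: itself plus d-1 successors."""
    base, tag = dcds
    out = []
    for name in base[:d]:
        out.extend(name + (tag,) * k for k in range(d))
    return out


def name_length(cell):
    return len(cell) - 1            # number of tags; the first component is the base name


def states_subset(sigma, tau):
    d_sigma, d_tau = depth(sigma), depth(tau)                 # step 1
    l = unroll(d_tau, tau)                                    # step 2
    name_tau = max(name_length(c) for c in l)                 # step 3
    l_sigma = unroll(d_sigma + name_tau, sigma)               # step 4
    name_sigma = max(name_length(c) for c in l_sigma)         # step 5
    l_tau = unroll(d_tau + name_sigma, tau)                   # step 6
    return set(l_sigma) <= set(l_tau)                         # step 7


M1 = ([("N", "l")], "l")
M2 = ([("N", "l"), ("N", "l", "l")], "s")

# Unrolling each dcds only to its own depth is misleading.
naive = set(unroll(depth(M1), M1)) <= set(unroll(depth(M2), M2))
print(naive)                  # True
print(states_subset(M1, M2))  # False
```

The naive comparison sees only N.l and N.l.l on the M1 side, both of which M2 also has; the extra unrolling in step 4 exposes N.l.l.l, which M2 can never generate.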

Example 5.5.10 We show the algorithm in operation on the two dcds’s defined above, $M_1$ and
$M_2$, deciding whether $D(M_1) \subseteq D(M_2)$. First we find the depth:

    depth(M1) = 2, and depth(M2) = 3.

Unrolling $M_2$ 3 times produces:

    # show more 3 M2;
    {
    (N.l) values [..],
    ((N.l).s) values [..],
    (((N.l).s).s) values [..],
    ((N.l).l) values [..],
    (((N.l).l).s) values [..],
    ((((N.l).l).s).s) values [..]}

Therefore, the length of the longest cell name in the exposed portion of $M_2$ is 4. Now we unroll $M_1$
6 times which produces:

    # show more 6 M1;
    {
    (N.l) values [..],
    ((N.l).l) values [..],
    (((N.l).l).l) values [..],
    ((((N.l).l).l).l) values [..],
    (((((N.l).l).l).l).l) values [..],
    ((((((N.l).l).l).l).l).l) values [..]}

The longest cell name here is 6, so we next unroll $M_2$ 9 times, in order to give it a chance to
generate all cell names in this portion of $M_1$:

    # show more 9 M2;
    {
    (N.l) values [..],
    ((N.l).s) values [..],
    (((N.l).s).s) values [..],
    ((((N.l).s).s).s) values [..],
    (((((N.l).s).s).s).s) values [..],
    ((((((N.l).s).s).s).s).s) values [..],
    (((((((N.l).s).s).s).s).s).s) values [..],
    ((((((((N.l).s).s).s).s).s).s).s) values [..],
    (((((((((N.l).s).s).s).s).s).s).s).s) values [..],
    ((N.l).l) values [..],
    (((N.l).l).s) values [..],
    ((((N.l).l).s).s) values [..],
    (((((N.l).l).s).s).s) values [..],
    ((((((N.l).l).s).s).s).s) values [..],
    (((((((N.l).l).s).s).s).s).s) values [..],
    ((((((((N.l).l).s).s).s).s).s).s) values [..],
    (((((((((N.l).l).s).s).s).s).s).s).s) values [..],
    ((((((((((N.l).l).s).s).s).s).s).s).s).s) values [..]}

Comparing the finite unrollings of $M_1$ and $M_2$ is enough to convince us that $D(M_1) \not\subseteq D(M_2)$.

Theorem 5.5.11 Given two dcds’s, $\sigma$ and $\tau$, Algorithm 5.5.9 will claim $D(\sigma) \subseteq D(\tau)$ iff it is the
case that $D(\sigma) \subseteq D(\tau)$.

Proof: Completeness is relatively simple to establish. Suppose $D(\sigma) \subseteq D(\tau)$. The only issue is
whether we have unrolled $\tau$ enough times to let it generate all cva elements in $L_\sigma$. But $\tau$ is being
unrolled $d_\tau + name_\sigma$ times, which, according to Corollary 5.5.8, will make the deepest embedded
recursive dcds in $\tau$ apply its tag at least $name_\sigma + 1$ times. But clearly, there is nothing to be gained
by unrolling $\tau$ any further, because any new cell names we generate will be strictly longer than the
cell names in $L_\sigma$. Therefore, if $D(\sigma) \subseteq D(\tau)$, all cva elements in $L_\sigma$ will have been generated by
that point.

Correctness is more difficult to establish. We need to prove that if according to Algorithm 5.5.9
$D(\sigma) \subseteq D(\tau)$, then that is indeed the case. To do this, we need to show that $\sigma$ cannot possibly
generate a cva element not in $\tau$.

For simplicity, let us assume that each recursive dcds in $\sigma$ applies a different tag. The proof is
by contradiction. Let $c$ be the shortest cell name generated by $\sigma$ which is not in $\tau$. Then $c$ will
have the form:

$$c = \mathit{name} + k_\sigma t + r_\sigma n_\sigma t,$$

where $\mathit{name}$ is the part of the cell name that does not contain any of the tags applied by recursive
steps, $k_\sigma t$ denotes the addition of $k_\sigma$ tags $t$ which are the same as the ones from the recursive step
but already existed in $c$ prior to any unrolling, and $r_\sigma n_\sigma t$ denotes $n_\sigma$ unrollings, each of which
applies $r_\sigma$ tags $t$ at a time. We have used the addition symbol (+) to denote the application of a
tag (i.e., $N.1.1 = N + 2 \cdot 1$, the tag 1 applied twice).

First we will show that $n_\sigma \geq 2$. In step 2 of the algorithm we unroll $\tau$ for $d_\tau$ times, so
$\mathit{name}_\tau \geq 1$, and in step 4 we unroll $\sigma$ for $d_\sigma + \mathit{name}_\tau$ times. By Corollary 5.5.8, this means that
the deepest recursion in $\sigma$ applies at least $\mathit{name}_\tau + 1 \geq 2$ tags. But any cell in $L_\sigma$ is in $\tau$, and so
our counterexample cell $c$ must have more than 2 tags.

Since $c$ is, by assumption, the shortest counterexample, its immediate precursors are in $\tau$. Let
us denote them $c'$, $c''$. They have the following form:

$$c' = \mathit{name} + k_\sigma t + r_\sigma (n_\sigma - 1) t,$$
$$c'' = \mathit{name} + k_\sigma t + r_\sigma (n_\sigma - 2) t.$$

Since $c'$, $c''$ are in $\tau$, there must be a way to construct them in $\tau$. Because the first 4 steps of
the algorithm unroll $\sigma$ enough times to differentiate it from the nonrecursive part of $\tau$, $c'$ and $c''$
must be generated by $\tau$ recursively:

$$c' = \mathit{name} + k_\tau t + r_\tau n_\tau t,$$
$$c'' = \mathit{name} + k_\tau t + r_\tau (n_\tau - k) t,$$

where, as before, $k_\tau t$ denotes the tags coming from the nonrecursive part of $\tau$, and $r_\tau t$ is the number
of tags placed by one unrolling of $\tau$. We now show that if $\tau$ can recursively generate two successive
cells of $\sigma$, then it can generate any subsequent one.

By taking the difference of $c'$ and $c''$ on both the $\sigma$ and $\tau$ side, we get the number of tags that
are different. We use the notation $c' - c''$ for this:

$$c' - c'' = r_\sigma t = r_\tau k t.$$

But this means that what takes $\sigma$ one unrolling to accomplish, can be done with $k$ unrollings of $\tau$,
and so we can actually express $c$ in $\tau$:

$$c = \mathit{name} + k_\tau t + r_\tau (n_\tau + k) t.$$

But then $c$ is not a counterexample after all and we have established a contradiction. This means
that $\tau$ can generate all cells in $\sigma$, and that, indeed, $D(\sigma) \subseteq D(\tau)$. □
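
The counting step at the heart of the correctness argument can be checked numerically. The following is a minimal sketch in Python, not part of the original development: it assumes a single tag $t$, identifies a cell with the number of copies of $t$ in its name, and uses hypothetical helper names (`tau_generates`, `successor_in_tau`) and hypothetical numbers.

```python
def tau_generates(count, k_tau, r_tau):
    """True if tau yields a cell carrying `count` copies of tag t,
    i.e. count = k_tau + r_tau * n for some n >= 0 unrollings."""
    return count >= k_tau and (count - k_tau) % r_tau == 0


def successor_in_tau(c2, r_sigma, k_tau, r_tau):
    """Given c'' (tag count c2) and c' = c'' + r_sigma, both generated by tau,
    return (k, c): one unrolling of sigma equals k unrollings of tau,
    and c is the tag count of the next cell generated by sigma."""
    c1 = c2 + r_sigma
    assert tau_generates(c2, k_tau, r_tau) and tau_generates(c1, k_tau, r_tau)
    k = (c1 - c2) // r_tau              # from c' - c'' = r_sigma t = r_tau k t
    c = c1 + r_sigma                    # next cell on the sigma side
    n_tau = (c1 - k_tau) // r_tau       # unrollings of tau that produce c'
    assert c == k_tau + r_tau * (n_tau + k)
    return k, c


# Hypothetical numbers: sigma adds 4 tags per unrolling, tau adds 2.
k, c = successor_in_tau(c2=5, r_sigma=4, k_tau=1, r_tau=2)
print(k, c, tau_generates(c, k_tau=1, r_tau=2))  # prints: 2 13 True
```

The second assertion inside `successor_in_tau` is exactly the final equation of the proof, $c = \mathit{name} + k_\tau t + r_\tau (n_\tau + k) t$, with the common prefix $\mathit{name}$ dropped.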

The same kind of argument can be carried out in the more general setting of dcds’s with recursive
components  which  apply  the  same  tags.  The  difference  is  that  a  cell  name  like  c  will  have  a  more 
complicated  general  form,  since  recursive  contributions  can  come  from  several  places. 

We make use of Algorithm 5.5.9 in deciding whether all maximal states in the subtype are
maximal in the supertype. In particular, we will use the finite lists of cva elements $L_\sigma$, $L_\tau$.

Algorithm 5.5.12 (Deciding maximality requirement) Given $\sigma$, $\tau$, such that $D(\sigma) \subseteq D(\tau)$,
compute $L_\sigma$, $L_\tau$ as in Algorithm 5.5.9 and check:

1. All initial cells in $\tau$ are in $\sigma$ (with $V_\sigma \subseteq V_\tau$).

2. All cells enabled by the common values (from both $\sigma$ and $\tau$) of the initial cells in $\tau$ are in $\sigma$,
and so on, recursively.

The algorithm terminates because we are only examining the finite $L_\sigma$, $L_\tau$. Completeness is
again  simple  to  establish.  For  correctness  we  can  use  an  argument  similar  to  the  one  from  the  proof 
of  Theorem  5.5.11.  As  before,  we  are  examining  finite  pieces  of  infinite  dcds’s  that  already  have 
the  relevant  structure  of  the  whole. 

Example 5.5.13 We show how to decide the maximality requirement for one_intlist and intlist.
Since one_intlist is not recursive, we do not need to unroll intlist as many times as dictated by
Algorithm 5.5.9. It is enough to unroll intlist a number of times equal to its depth (3) plus the
length of the longest cell name in one_intlist (1). The lists we get then are:

# show more 3 one_intlist;
{
EMPTY values false,
(N.1) values [..] access EMPTY=false,
(EMPTY.1) values true access EMPTY=false}


# show more 4 intlist;
{
EMPTY values true, false,
(EMPTY.1) values true, false access EMPTY=false,
((EMPTY.1).1) values true, false access (EMPTY.1)=false,
(((EMPTY.1).1).1) values true, false access ((EMPTY.1).1)=false,
(N.1) values [..] access EMPTY=false,
((N.1).1) values [..] access (EMPTY.1)=false,
(((N.1).1).1) values [..] access ((EMPTY.1).1)=false,
((((N.1).1).1).1) values [..] access (((EMPTY.1).1).1)=false}

There is only one initial cell in intlist, EMPTY, and it is also initial in one_intlist with a smaller
set of values. The two cells enabled by {EMPTY = false} in intlist are (N.1) and (EMPTY.1)
and they are also enabled in one_intlist, again with a smaller set of values. Nothing is enabled
by {EMPTY = false, (N.1) = n, (EMPTY.1) = true} in intlist and there are no more cells in
one_intlist, so the maximal states in one_intlist are indeed maximal in intlist as well.
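
The check carried out by hand in this example can be sketched mechanically. The Python below is an illustration only, under simplifying assumptions that are not part of the original system: each finite list is a dictionary from cell name to a set of values plus a single access condition, the integer range is a placeholder symbol, and the function name `maximality_ok` is hypothetical.

```python
INT = "[..]"  # placeholder standing for the whole integer range

# cell name -> (set of values, access condition (cell, value), or None if initial)
one_intlist = {
    "EMPTY": ({"false"}, None),
    "(N.1)": ({INT}, ("EMPTY", "false")),
    "(EMPTY.1)": ({"true"}, ("EMPTY", "false")),
}
intlist = {
    "EMPTY": ({"true", "false"}, None),
    "(EMPTY.1)": ({"true", "false"}, ("EMPTY", "false")),
    "((EMPTY.1).1)": ({"true", "false"}, ("(EMPTY.1)", "false")),
    "(((EMPTY.1).1).1)": ({"true", "false"}, ("((EMPTY.1).1)", "false")),
    "(N.1)": ({INT}, ("EMPTY", "false")),
    "((N.1).1)": ({INT}, ("(EMPTY.1)", "false")),
    "(((N.1).1).1)": ({INT}, ("((EMPTY.1).1)", "false")),
    "((((N.1).1).1).1)": ({INT}, ("(((EMPTY.1).1).1)", "false")),
}


def maximality_ok(sub, sup):
    """Every cell of `sup` reachable through values shared with `sub`
    must also be a cell of `sub`."""
    pending = [cell for cell, (_, access) in sup.items() if access is None]
    visited = set()
    while pending:
        cell = pending.pop()
        if cell in visited:
            continue
        visited.add(cell)
        if cell not in sub:
            return False
        common = sub[cell][0] & sup[cell][0]
        # Follow only the cells of `sup` enabled by a value both sides share.
        for other, (_, access) in sup.items():
            if access is not None and access[0] == cell and access[1] in common:
                pending.append(other)
    return True


print(maximality_ok(one_intlist, intlist))  # prints: True
```

Dropping (EMPTY.1) from the subtype makes the check fail, since that cell is still enabled by EMPTY = false in intlist.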


5.6  Monomorphic  type  inference 


Type  inference  systems  which  contain  subtyping  normally  incorporate  the  notion  of  substitutability 
using  a  subsumption  typing  rule: 


$$\text{(Sub)} \qquad \frac{a : \sigma \qquad \sigma \leq \tau}{a : \tau}$$


We  will  not  have  this  rule  explicitly  in  our  system,  because  it  would  not  be  syntax-directed.  Instead 
it  will  be  absorbed  into  the  other  rules. 

Our type inference system for expressions is shown in Figure 5.3. The form of the rules deserves
some explanation. We consider the rule for application. As noted before, in general, an algorithm
will have an intersection type. Suppose algorithm $a : \bigwedge[\sigma_1 \to \tau_1 \;..\; \sigma_n \to \tau_n]$. By applying And-Elim
we can deduce $a : \sigma_i \to \tau_i$ for each $i$. Suppose algorithm $b : \bigwedge[\delta_1..\delta_m]$. Again, by And-Elim, $b : \delta_j$
for each $j$. But then for each $j$ for which it is the case that there exists an $i$ such that $\delta_j \leq \sigma_i$,
we can apply the subsumption rule to either the type of $a$ or $b$ to get an application argument
of the right type, so we can obtain $a.b : \tau_i$. We then collect all the $\tau_i$'s for which we can do this
in an intersection as the final type of the application. The other rules have a similar form. For
instance, the rule for fixpoint also incorporates subsumption. If $a : \sigma_i \to \tau_i$ with $\sigma_i \geq \tau_i$ then by
Sub-Arrow and by subsumption $a : \sigma_i \to \sigma_i$. Therefore, $\mathit{fix}(a) : \sigma_i$. Using
And-Elim, And-Intro we can do this for each $\sigma_i$ that matches the condition.
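
The way subsumption is absorbed into the application and fixpoint rules can be illustrated with a small sketch. The Python below is not the implementation described here; it assumes ground types are plain strings, an intersection of arrows is a list of (domain, codomain) pairs, and the subtype hierarchy (true and false below bool) and the names `leq`, `app_type`, `fix_type` are hypothetical.

```python
# Hypothetical subtype hierarchy: true <= bool and false <= bool.
SUBTYPE = {("true", "bool"), ("false", "bool")}


def leq(a, b):
    """Subtyping on ground types; SUBTYPE is assumed transitively closed."""
    return a == b or (a, b) in SUBTYPE


def app_type(arrows, arg_types):
    """Application: keep tau_i whenever some argument type delta_j
    satisfies delta_j <= sigma_i."""
    return [tau for sigma, tau in arrows
            if any(leq(delta, sigma) for delta in arg_types)]


def fix_type(arrows):
    """Fixpoint: keep sigma_i whenever sigma_i >= tau_i."""
    return [sigma for sigma, tau in arrows if leq(tau, sigma)]


a = [("bool", "bool"), ("true", "false"), ("int", "int")]
print(app_type(a, ["true"]))  # prints: ['bool', 'false']
print(fix_type(a))            # prints: ['bool', 'int']
```

The returned lists stand for the intersections collected by the rules; an empty list would correspond to an expression with no type.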

Now  we  present  our  type  inference  algorithm. 


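Algorithm 5.6.1 (Monomorphic type inference) Given an expression $e$, $A_m(e)$ is defined as
follows:

1. If $e$ is a ground state, $x$, we match it to all dcds in the subtype hierarchy, and construct its
minimal type, $\bigwedge[\sigma_1..\sigma_n]$.

2. If $e$ is an algorithm $a$, or equivalently a higher-order state $x$, we collect all input cells and
values, $x_{in}$, and get its minimal type, $\bigwedge[\sigma_1..\sigma_n]$. We do the same for the outputs, $x_{out}$, and
we get the minimal type $\bigwedge[\tau_1..\tau_m]$. The type of the algorithm will be the canonical type:

$$\bigwedge[\sigma_1 \to \tau_1 \;..\; \sigma_1 \to \tau_m \;..\; \sigma_n \to \tau_1 \;..\; \sigma_n \to \tau_m]$$

3. If $e$ is a combinator expression, we apply the inference rules.


$$\text{(App)} \qquad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \qquad b : \bigwedge[\delta_1..\delta_m]}{a.b : \bigwedge[\tau_i \mid \delta_j \leq \sigma_i]}$$

$$\text{(Comp)} \qquad \frac{a : \bigwedge[\tau_i \to \delta_i \mid i \in 1..n] \qquad b : \bigwedge[\sigma_j \to \tau'_j \mid j \in 1..m]}{a|b : \bigwedge[\sigma_j \to \delta_i \mid \tau'_j \leq \tau_i]}$$

$$\text{(Fix)} \qquad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]}{\mathit{fix}(a) : \bigwedge[\sigma_i \mid \sigma_i \geq \tau_i]}$$

$$\text{(Curry)} \qquad \frac{a : \bigwedge[(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n]}{\mathit{curry}(a) : \bigwedge[\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n]}$$

$$\text{(Uncurry)} \qquad \frac{a : \bigwedge[\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n]}{\mathit{uncurry}(a) : \bigwedge[(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n]}$$

$$\text{(Pair)} \qquad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \qquad b : \bigwedge[\delta_j \to \zeta_j \mid j \in 1..m]}{\langle a, b \rangle : \bigwedge[\sigma_i \to (\tau_i \times \zeta_j) \mid \delta_j \leq \sigma_i]}$$

$$\text{(Prod)} \qquad \frac{a : \bigwedge[\sigma_1..\sigma_n] \qquad b : \bigwedge[\tau_1..\tau_m]}{(a, b) : \bigwedge[\sigma_i \times \tau_j \mid i \in 1..n,\ j \in 1..m]}$$

Figure 5.3: Monomorphic type inference rules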
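
Step 2 of the algorithm pairs every minimal input type with every minimal output type. A minimal Python sketch of that construction follows; the function name `canonical_type` and the placeholder type names are assumptions made for illustration only.

```python
from itertools import product


def canonical_type(input_types, output_types):
    """Canonical type of an algorithm: one arrow sigma_i -> tau_j for each
    pair of a minimal input type and a minimal output type."""
    return [(sigma, tau) for sigma, tau in product(input_types, output_types)]


# Placeholder type names s1, s2 (inputs) and t1, t2, t3 (outputs).
arrows = canonical_type(["s1", "s2"], ["t1", "t2", "t3"])
print(arrows[0], arrows[2], arrows[3], arrows[-1])
```

The arrows come out in the order of the displayed canonical type: first all arrows from the first input type, and last the arrow from the last input type to the last output type.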


Theorem 5.6.2 (Soundness of monomorphic type inference) If $A_m(e) = \tau$ then $e : \tau$.

Proof: By induction on the structure of $e$:

1. $e = x$. If $e$ is a ground state, we assign it its minimal type, $\bigwedge[\sigma_1..\sigma_n]$, so $\tau = \bigwedge[\sigma_1..\sigma_n]$ and
indeed $e : \tau$.

2. $e = a$ or $e = x$ higher-order. When $e$ is an algorithm, or equivalently, a higher-order state,
we get the minimal type for the input and output dcds and construct a canonical type with
intersections at the outermost level, $\bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]$. It is the case that for each $i$,
$a : \sigma_i \to \tau_i$, because $a \in D(\sigma_i \to \tau_i)$. Then $a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]$.

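3. $e = a.b$. By the induction hypothesis, $a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]$ and $b : \bigwedge[\delta_1..\delta_m]$. According
to the definition of typing judgments, in the case of $a$, for each $i$, $a \in D(\sigma'_i \to \tau'_i)$ with
$\sigma'_i \to \tau'_i \leq \sigma_i \to \tau_i$, and, in the case of $b$, for each $j$, $b \in D(\delta'_j)$, where $\delta'_j \leq \delta_j$. For the $i$ such
that there exists a $j$ with $\delta_j \leq \sigma_i$, we have $\delta'_j \leq \delta_j \leq \sigma_i \leq \sigma'_i$. Then there exists a coercion
$\rho_{in} : \delta'_j \to \sigma'_i$. But then it will be the case that $a.b \in D(\tau'_i)$, therefore $a.b : \tau_i$. Doing this for
every $i$ that satisfies the condition, we get $a.b : \bigwedge[\tau_i \mid \delta_j \leq \sigma_i]$.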

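4. $e = \langle a, b \rangle$. By the induction hypothesis, we have $a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]$ and also
$b : \bigwedge[\delta_j \to \zeta_j \mid j \in 1..m]$. For $i$ such that there exists $j$ with $\delta_j \leq \sigma_i$ we obtain a coercion
$\rho_{in} : \delta_j \to \sigma_i$. Since $a : \sigma_i \to \tau_i$, by the definition of typing judgments, $a \in D(\sigma'_i \to \tau'_i)$ with
$\sigma_i \leq \sigma'_i$ and $\tau'_i \leq \tau_i$. Similarly, $b \in D(\delta'_j \to \zeta'_j)$ with $\delta_j \leq \delta'_j$ and $\zeta'_j \leq \zeta_j$. Then for some $x : \sigma_i$,
$\langle a, b \rangle.x \in D(\tau'_i \times \zeta'_j)$, which implies $\langle a, b \rangle.x : \tau_i \times \zeta_j$. Then $\langle a, b \rangle : \sigma_i \to (\tau_i \times \zeta_j)$.
Doing this for all $j$ that meet the requirement yields the desired type.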

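5. $e = \mathit{curry}(a)$. By the induction hypothesis, $a : \bigwedge[(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n]$. By And-Elim, for
each $i$, $a : (\sigma_i \times \sigma'_i) \to \tau_i$. This means that $a \in D((s \times s') \to t)$ with $(s \times s') \to t \leq (\sigma_i \times \sigma'_i) \to \tau_i$.
This implies, by Sub-Arrow, that $\sigma_i \times \sigma'_i \leq s \times s'$ and $t \leq \tau_i$. Further, by Sub-Prod, we
have $\sigma_i \leq s$ and $\sigma'_i \leq s'$. But then, by Sub-Arrow again, $s \to s' \to t \leq \sigma_i \to \sigma'_i \to \tau_i$. Since
$\mathit{curry}(a) \in D(s \to s' \to t)$, this implies that $\mathit{curry}(a) : \sigma_i \to \sigma'_i \to \tau_i$. Doing this for all $i$ yields the
desired type.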

6.  Composition  and  fixpoint  are  similar  to  application  and  pair.  Uncurry  and  product  are  similar 
to  curry. 

□ 

Completeness,  in  the  sense  of  always  deducing  the  lowest  type,  is  not  possible  in  this  system  for 
undecidability  reasons.  The  user  may  define  a  certain  type  hierarchy  which  requires  us  to  decide, 
for  instance,  whether  an  algorithm  terminates,  in  order  to  assign  it  the  lowest  type. 

Example 5.6.3 We presented a CDSO algorithm for boolean negation in Figure 2.3. We illustrate
our monomorphic type inference algorithm on the expression not.{B = tt}. When typing not, we
collect all input cells and their values, and all output cells and their values. Let us assume that the
only dcds’s defined in our system are bool and intlist. The inputs are cell B with values tt, ff, and
likewise for the outputs. The only matching dcds is bool, so we conclude:

not : bool $\to$ bool, {B = tt} : bool.

By App we then conclude:

not.{B = tt} : bool.


5.7  Polymorphism  and  overloading 

We  have  seen  in  Section  2.3.4  how  to  write  algorithms  with  generic  (i.e.,  variable)  cell  names  and 
values. Variable names begin with the special symbol $. When both cell and value references in
an  algorithm  are  variable,  we  get  a  polymorphic  type.  When  only  one  or  the  other  is,  or  their  shape 
is  constrained  in  some  way,  we  get  an  overloaded  type.  The  other  way  of  getting  overloaded  types 
is  to  have  a  missing  cell  or  value  name,  so  that  we  only  know  either  the  cell  name  or  the  value.  In 
that  case,  as  well,  we  are  reduced  to  matching  what  we  have  to  the  entire  subtype  hierarchy,  which 
might  result  in  different,  incompatible  matches. 

We have already encountered the polymorphic identity in Section 2.3.4. Figure 5.4 shows some
more examples of polymorphic algorithms and an example of an overloaded algorithm. The first
projection is almost the same as the polymorphic identity, except it finds its input in the left side
of a product. Conditional is an example of a mix of generic and nonvariable cell names and values.
The input to cond is of the form $(\mathit{bool} \times \alpha) \times \alpha$ (which explains the tags on the cells) and the output
has type $\alpha$. The last example shows overloading. Minus is an algorithm that will work on any dcds
which has some initial cells whose values are pairs of an integer and something else. For instance,
if we had the following dcds’s defined in the system,

let fractions = dcds
  cell R values ([..].[1..])
end;

let series = dcds
  cell (R.[0..]) values ([..].[1..])
end;

minus should work on both, even though they have disjoint sets of cells.

let fst = algo
  request $C do
    valof ($C.1) is
      $V : output $V
    end
  end
end;

let minus = algo
  request $C do
    valof $C is
      ($V.$W) : output (~$V.$W)
    end
  end
end;

let cond = algo
  request $C do
    valof ((B.1).1) is
      tt: valof (($C.2).1) is
            $V : output $V
          end
      ff: valof ($C.2) is
            $W : output $W
          end
    end
  end
end;

Figure 5.4: First projection, conditional, and an overloaded algorithm

When  typing  an  algorithm  with  variable  cell  and  value  references,  we  have  to  decide,  first,  if 
it  is  a  polymorphic  or  an  overloaded  reference  (polymorphic  references  have  no  constraints  on  the 
shape  of  the  value,  excluding  product  tags,  and  both  cell  and  value  are  variable),  and  second,  we 
need  to  look  for  matches  between  variables  used  in  the  input  and  output. 

For example, in the case of the polymorphic identity, we have $C, $V in both input and output,
so if we assume the input has type $\alpha$ and the output $\beta$, the matching stage will conclude that $\alpha \mapsto \beta$,
and the final type will be $\forall\alpha.\ \alpha \to \alpha$. The first projection is similar: we have $C.1, $V in the input
and $C, $V in the output, which gives rise to the types $\alpha \times \beta$ and $\gamma$ respectively. The matching
stage will conclude that $\alpha \mapsto \gamma$, so the resulting type is $\forall\alpha.\ \alpha \times \beta \to \alpha$.

Polymorphic  types  will  be  handled  in  the  standard  way,  using  instantiation  and  generalization 
rules.  There  will  be  difficulties,  however,  with  instantiating  polymorphic  types.  In  general,  we  will 
have  to  solve  a  constraint  satisfaction  problem.  This  is  discussed  in  more  detail  in  the  next  section. 

For  overloaded  types  we  do  not  want  to  use  “meet”  to  put  together  the  various  branches,  because 
an  algorithm  with  an  overloaded  type  will  not  simultaneously  have  all  the  types  in  the  branches. 
It  depends  on  particular  instantiations.  Therefore,  we  introduce  new  notation. 

Definition 5.7.1 (Overloaded types) An algorithm $a$ has type $\{\sigma_1 \to \tau_1 \;..\; \sigma_n \to \tau_n\}$ (abbreviated
$\{\sigma_i \to \tau_i \mid i \in 1..n\}$), when there exist instantiations of the variable cell and value names in $a$ so
that $a : \sigma_i \to \tau_i$ for each $i$. The instantiations do not have to be the same for each $i$.

Note  that  it  is  possible  to  have  a  ground  state  that  is  overloaded,  but  it  does  not  make  computational 
sense,  and  so  we  will  disallow  it. 

We add the following subtyping rules: one that relates different overloaded types, and one that
relates intersection and overloaded types.

$$\text{(Sub-Over)} \qquad \frac{\forall j.\ \exists i.\ \sigma_i \to \tau_i \leq \delta_j \to \zeta_j}{\{\sigma_i \to \tau_i \mid i \in 1..n\} \leq \{\delta_j \to \zeta_j \mid j \in 1..m\}}$$

$$\text{(Sub-Meet-Over)} \qquad \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \leq \{\sigma_i \to \tau_i \mid i \in 1..n\}$$

Note that we do not treat overloaded types like unions. If we had, we would have had a rule of the
form $\sigma_j \to \tau_j \leq \{\sigma_i \to \tau_i \mid i \in 1..n\}$, but this is wrong from the substitutability point of view.
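
A small sketch may help to see why Sub-Over rejects the union-like rule. The Python below is illustrative only: it assumes a hypothetical hierarchy with true and false below bool, represents an arrow as a (domain, codomain) pair, and assumes Sub-Arrow has its usual form (contravariant in the domain, covariant in the codomain); the names `arrow_leq` and `sub_over` are not from the original system.

```python
SUBTYPE = {("true", "bool"), ("false", "bool")}  # hypothetical hierarchy


def leq(a, b):
    """Subtyping on ground types; SUBTYPE is assumed transitively closed."""
    return a == b or (a, b) in SUBTYPE


def arrow_leq(f, g):
    """Sub-Arrow: s -> t <= s2 -> t2 when s2 <= s and t <= t2."""
    (s, t), (s2, t2) = f, g
    return leq(s2, s) and leq(t, t2)


def sub_over(lhs, rhs):
    """Sub-Over: every branch on the right must lie above some branch on the left."""
    return all(any(arrow_leq(l, r) for l in lhs) for r in rhs)


print(sub_over([("bool", "true"), ("int", "int")], [("true", "bool")]))  # prints: True
print(sub_over([("bool", "bool")], [("bool", "bool"), ("int", "int")]))  # prints: False
```

The second call is the union-like situation: a single arrow is not below an overloaded type that also promises an int branch, because nothing on the left can stand in for that branch.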

Given  the  above  definition,  and  our  earlier  type  declarations,  we  want  to  say  that: 

$$\mathit{minus} : \{\mathit{fractions} \to \mathit{fractions},\ \mathit{series} \to \mathit{series}\}.$$

When  it  comes  time  to  apply  an  overloaded  type  we  have  to  be  careful  because  we  have  no 
control  over  what  is  in  the  various  branches.  The  only  thing  we  know  is  that  the  branches  all 
have  the  same  shape.  Therefore,  it  is  possible  for  an  input  to  match  several  branches.  Since  the 
algorithm does not simultaneously have all the types in its branches, we would then have to take
the union of the output types of the matching branches. Thus we introduce union types.

Definition 5.7.2 (Union types) Given two types, $\sigma$, $\tau$, the union of $\sigma$ and $\tau$ (written $\bigvee[\sigma, \tau]$) is
the least upper bound of $\sigma$, $\tau$ in the subtype hierarchy.

We  do  not  introduce  union  types  in  their  full  generality.  In  particular,  we  will  not  have  unions 
belong  to  any  reported  type.  Rather,  they  will  be  used  internally.  When  the  union  of  two  types 
does  not  exist  in  the  subtype  hierarchy,  this  will  result  in  a  type  error. 
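
The following Python sketch illustrates union types as least upper bounds in a finite hierarchy, and their use when an argument matches several branches of an overloaded type. It is a sketch under stated assumptions rather than the system's implementation: the hierarchy (true and false below bool, int unrelated) and the names `union` and `app_over` are hypothetical.

```python
TYPES = ["true", "false", "bool", "int"]           # hypothetical hierarchy
SUBTYPE = {("true", "bool"), ("false", "bool")}


def leq(a, b):
    """Subtyping on ground types; SUBTYPE is assumed transitively closed."""
    return a == b or (a, b) in SUBTYPE


def union(types):
    """Least upper bound of `types` within TYPES, or None if there is none."""
    uppers = [u for u in TYPES if all(leq(t, u) for t in types)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None


def app_over(branches, arg):
    """Apply an overloaded type: union the outputs of all branches whose
    input type is a supertype of the argument type."""
    matches = [tau for sigma, tau in branches if leq(arg, sigma)]
    if not matches:
        raise TypeError("no branch accepts the argument")
    result = union(matches)
    if result is None:
        raise TypeError("union of the matching branches does not exist")
    return result


print(union(["true", "false"]))                                   # prints: bool
print(union(["bool", "int"]))                                     # prints: None
print(app_over([("bool", "true"), ("true", "false")], "true"))    # prints: bool
```

A missing union surfaces as a `TypeError`, mirroring the decision to report a type error instead of inventing a new union type.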


5.8  Subtyping  with  polymorphic  types 

When adding polymorphism to our subtyping system, it is not enough to introduce rules of the
form $\alpha \leq \sigma$ and $\sigma \leq \alpha$. The problem is that there may be several possible instantiations of $\alpha$,
but that only some meet the requirement. Consider applying an algorithm of type $(\alpha \times \alpha) \to \alpha$ to
$\mathit{true} \times \mathit{false}$. By App, it must be the case that $\mathit{true} \times \mathit{false} \leq \alpha \times \alpha$, which by Sub-Prod implies
$\mathit{true} \leq \alpha$ and $\mathit{false} \leq \alpha$. If we just instantiate $\alpha$ to, say, $\mathit{true}$, the whole thing fails. So, in general,
what we need to do is solve a constraint satisfaction problem. In our example there is a very simple
solution, $\alpha \mapsto \mathit{bool}$, so the resulting type is $\mathit{bool}$.

Algorithm 5.8.1 (Subtyping polymorphic types) Given the need to satisfy $\sigma \leq \tau$ do the
following:

1. Recursively break down $\sigma$ and $\tau$ using the subtyping rules, until we arrive at subtyping between
ground dcds and type variables.

2. Collect all constraints on type variables.

3. Pick a type variable. Suppose it is $\alpha$. Collect all constraints on $\alpha$ from the list. They will
have the form:

$$\alpha \leq \tau_1, \ldots, \alpha \leq \tau_n, \quad \text{and} \quad \alpha \geq \sigma_1, \ldots, \alpha \geq \sigma_m.$$

4. If any of the constraints have right hand sides involving type variables go back to the previous
step and pick another $\alpha$. If all type variables have constraints with other type variables then:

• If all right hand sides of all constraints of all type variables are type variables, then
attempt to unify them.

• If some right hand sides involve nonvariables, then choose those first for the next step.

5. Unify all the $\tau_i$ and all the $\sigma_j$. This may generate new constraints for some type variables. If
unification fails we fail.

6. Let the new constraints be $\alpha \leq \tau^*$ and $\alpha \geq \sigma^*$. If it is not the case that $\sigma^* \leq \tau^*$ we fail.
Else, for all $\gamma$ in the subtype hierarchy between $\sigma^*$ and $\tau^*$, make the substitution $\alpha \mapsto \gamma$
everywhere, and attempt to resolve all other constraints.

7. If the attempt fails, backtrack and try another $\gamma$. If there are no more $\gamma$ we fail.

8. Do the same for all remaining type constraints.

The  subtyping  algorithm  works  by  essentially  trying  out  all  possibilities.  It  terminates  because 
the  subtype  hierarchy  is  finite,  and  because  we  require  the  least  upper  bound  and  greatest  lower 
bound of a set of types to actually be present in the subtype hierarchy. That is, if we needed
to find something satisfying $\mathit{true} \leq \alpha$ and $\mathit{false} \leq \alpha$ and there were no supertype of both true
and false, we would fail, rather than return the type $\bigvee[\mathit{true}, \mathit{false}]$. If the algorithm produces
a  substitution,  then  it  is  guaranteed  to  be  correct  because  we  terminate  successfully  only  if  all 
constraints  are  satisfied.  Completeness  of  the  algorithm  follows  from  the  fact  that  we  are  trying 
out  all  possibilities  from  the  subtype  hierarchy. 

Example 5.8.2 Suppose our constraints are:

$$\alpha \geq \mathit{false}, \qquad \alpha \leq \beta, \qquad \beta \geq \mathit{true}, \qquad \beta \leq \mathit{bool}.$$

Then step 4 of the algorithm will have us proceed with the constraints on $\beta$ first. There is nothing to
unify in the right hand sides, so we have $\mathit{true} \leq \beta \leq \mathit{bool}$. We make the substitution $\beta \mapsto \mathit{true}$ and
attempt to solve for $\alpha$, which fails in step 6 of the algorithm, because $\mathit{false} \not\leq \mathit{true}$. We backtrack
and try the only remaining choice for $\beta$, i.e., $\beta \mapsto \mathit{bool}$. This succeeds, with $\alpha \mapsto \mathit{bool}$ as well.
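
Because the subtype hierarchy is finite, the constraints of this example can also be solved by plain enumeration. The Python sketch below is a deliberately simplified stand-in for the algorithm: it replaces the ordering, unification and backtracking steps with an exhaustive search over all substitutions, and the hierarchy and the name `solve` are assumptions made for illustration.

```python
from itertools import product

TYPES = ["true", "false", "bool"]
# Reflexive subtype relation for the hypothetical hierarchy true, false <= bool.
LEQ = {("true", "bool"), ("false", "bool")} | {(t, t) for t in TYPES}


def solve(constraints, variables):
    """Return every substitution of ground types for `variables` that
    satisfies all constraints (l, r), each read as l <= r."""
    solutions = []
    for choice in product(TYPES, repeat=len(variables)):
        subst = dict(zip(variables, choice))
        if all((subst.get(l, l), subst.get(r, r)) in LEQ for l, r in constraints):
            solutions.append(subst)
    return solutions


constraints = [("false", "alpha"), ("alpha", "beta"),
               ("true", "beta"), ("beta", "bool")]
solutions = solve(constraints, ["alpha", "beta"])
print(solutions)
```

Every solution found maps beta to bool; no solution maps beta to true, which is the failed first attempt of the example. The substitution with both alpha and beta mapped to bool is among the results.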

5.9  Type  inference  with  polymorphism  and  overloading 

We  add  rules  for  generalization  and  instantiation,  and  note  that  we  only  apply  generalization  when 
typing  a  polymorphic  algorithm,  and  we  only  apply  instantiation  when  trying  to  subtype  with  type 
variables. We also need to add rules about overloaded types. Figure 5.5 contains all the new
rules.

There  are  only  slight  modifications  to  make  our  algorithm  for  monomorphic  type  inference  work 
in  the  more  general  setting. 

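Algorithm 5.9.1 (Polymorphic type inference) Given an expression $e$, $A_p(e)$ is defined as
follows:

1. If $e$ is a ground state, $x$, we match it to all dcds in the subtype hierarchy, and construct its
minimal type, $\bigwedge[\sigma_1..\sigma_n]$.

2. If $e$ is an algorithm $a$, or equivalently a higher-order state $x$, we collect all input cells and
values, $x_{in}$, and all output cells and values, $x_{out}$, and:

(a) If $e$ is fully polymorphic, figure out matches between input and output cell names and
values and generate a polymorphic type. Generalize it.

(b) If $e$ is overloaded, figure out matches between input and output and generate the overloaded
type $\{\sigma_1 \to \tau_1 \;..\; \sigma_n \to \tau_n\}$.

(c) If $e$ is monomorphic, find the minimal type for the input, $\bigwedge[\sigma_1..\sigma_n]$, and the output, $\bigwedge[\tau_1..\tau_m]$, and
generate the canonical type,

$$\bigwedge[\sigma_1 \to \tau_1 \;..\; \sigma_1 \to \tau_m \;..\; \sigma_n \to \tau_1 \;..\; \sigma_n \to \tau_m]$$

3. If $e$ is a combinator expression, we apply the inference rules.


$$\text{(Gen)} \qquad \frac{e : \sigma}{e : \forall\alpha.\ \sigma}$$

$$\text{(Inst)} \qquad \frac{e : \forall\alpha.\ \sigma}{e : [\tau/\alpha]\sigma}$$

$$\text{(App-Over)} \qquad \frac{a : \{\sigma_i \to \tau_i \mid i \in 1..n\} \qquad b : \sigma}{a.b : \bigvee[\tau_i \mid \sigma \leq \sigma_i]}$$

$$\text{(Comp-Over)} \qquad \frac{a : \{\tau_i \to \delta_i \mid i \in 1..n\} \qquad b : \sigma \to \tau'}{a|b : \sigma \to \bigvee[\delta_i \mid \tau' \leq \tau_i]}$$

$$\text{(Fix-Over)} \qquad \frac{a : \{\sigma_i \to \tau_i \mid i \in 1..n\}}{\mathit{fix}(a) : \bigvee[\sigma_i \mid \sigma_i \geq \tau_i]}$$

$$\text{(Curry-Over)} \qquad \frac{a : \{(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n\}}{\mathit{curry}(a) : \{\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n\}}$$

$$\text{(Uncurry-Over)} \qquad \frac{a : \{\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n\}}{\mathit{uncurry}(a) : \{(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n\}}$$

Figure 5.5: Type inference rules for polymorphism and overloading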

Theorem 5.9.2 (Soundness of polymorphic type inference) If $A_p(e) = \tau$ then $e : \tau$.

Proof: By induction on the structure of $e$. We only discuss some of the new rules, as the old ones
carry through unchanged.

1. $e = a$ or $e = x$ higher-order. When $e$ is an algorithm, or equivalently, a higher-order state,
there are several possibilities. If $e$ contains generic cell and value references (of the kind
$C, $V), we generate a polymorphic type and apply Gen. Since the references are generic, it
is the case that for any instantiation, $e$ will have that type. If $e$ is overloaded, we collect all
matching dcds, and generate an overloaded type $\{\sigma_i \to \tau_i \mid i \in 1..n\}$. It is the case that for
each $i$ there exists an instantiation of the cells and values in $e$ such that $e : \sigma_i \to \tau_i$.

2. $e = a.b$, with $a : \{\sigma_i \to \tau_i \mid i \in 1..n\}$ and $b : \sigma$. When $\sigma \leq \sigma_i$ it means that there exists
one particular instantiation of $a$, such that $a.b : \tau_i$. By collecting all such $i$ and computing
$\delta = \bigvee[\tau_i]$ (assuming it exists), we are ensuring that regardless of which instantiation is used,
$a.b : \delta$.

3. The other cases are handled similarly to application.

□

Example 5.9.3 We have already explained how to obtain a polymorphic type for the first projection,
fst, from Figure 5.4. We now illustrate step 2a of the algorithm by showing how to generate
a type for the conditional algorithm, also found in Figure 5.4.

The input cells and values of cond are ((B.1).1) = tt, ff, (($C.2).1) = $V, and ($C.2) = $W.
The output is $C = $V, $W. Because of the product tags, the type generated for the input is
$(\mathit{bool} \times \alpha) \times \beta$. The output gets type $\gamma$. In the matching stage, we observe that the input from the
second and third component of the product is fully polymorphic and matches the output, therefore
we make the substitutions $\alpha \mapsto \gamma$ and $\beta \mapsto \gamma$. The final type is then $(\mathit{bool} \times \alpha) \times \alpha \to \alpha$.


Chapter  6 

Refinement  Type  Inference 


In  this  chapter,  we  bring  together  several  disparate  ideas  into  a  novel  application  of  sequential 
algorithms,  and  an  example  of  the  practical  utility  of  intensional  semantics.  In  the  introduction,  we 
asked  the  question  of  how  to  take  advantage  of  the  intensional  information  present  in  an  intensional 
semantics  for  the  purpose  of  program  analysis.  We  provide  an  answer  to  that  question  here,  for  the 
semantics  of  sequential  algorithms,  by  using  CDSO  to  perform  refinement  type  inference  for  lazy, 
functional  programming  languages.  The  techniques  we  use  to  achieve  this  are  completely  different 
from  those  used  by  Freeman  and  Pfenning,  and  described  in  Chapter  2;  in  fact,  they  are  more 
closely  related  to  the  work  of  Hughes  and  Ferguson  on  abstract  interpretation  using  sequential 
algorithms,  also  presented  in  the  same  chapter. 

In  our  system,  the  user  may  choose  to  refine  a  type,  by  defining  finitely  many  refinements  of 
that  type.  Any  type  may  be  refined,  and  the  user  need  not  explicitly  state  which  types  refine 
which  type;  this  is  automatically  inferred  by  the  system.  A  type  and  its  refinements  can  always  be 
distinguished  by  examining  a  finite  number  of  cells,  which  we  shall  call  relevant  cells  from  the  point 
of  view  of  refinement  type  inference.  This  is  due  to  the  fact  that  only  finitely  many  refinements  of 
any  type  can  be  defined. 

When presented with a CDSO algorithm (not a combinator expression), it turns out to be
quite  easy  to  generate  a  refinement  type  by  examining  its  definition,  and  tracing  out  all  possible 
execution  paths,  collecting  information  about  which  inputs  lead  to  which  outputs.  This  information 
about  the  dependence  of  parts  of  the  output  on  parts  of  the  input  is  used  to  generate  the  refinement 
type. 

When  presented  with  a  CDSO  combinator  expression,  we  first  obtain  a  regular  type  for  it 
using  the  framework  described  in  the  previous  chapter.  We  then  use  the  regular  type  to  generate 
the appropriate relevant cells and we enter an interactive question-and-answer session with the
expression, asking for its value at the relevant cells. The result of the question-and-answer session
is  a  state,  which  is  a  small  approximation  of  the  expression.  We  obtain  a  refinement  type  for  the 
state  using  the  same  techniques  as  for  a  CDSO  algorithm.  Note  that  we  do  not  have  refinement 
type  inference  rules;  instead  we  perform  abstract  interpretation  on  the  expression  directly. 

We  begin  with  some  introductory  examples  of  how  to  obtain  refinement  types  for  algorithms 
in  Section  6.1.  We  then  formalize  the  notion  of  refinement  type  in  Section  6.2,  and  present  an 
algorithm  for  obtaining  refinement  types  for  CDSO  algorithms  (or  states)  in  Section  6.3.  To  give 
our  results  a  wider  applicability,  we  introduce  a  simple  lazy  functional  language  in  Section  6.4, 
which we also call, par abus de langage, PCF, and show how to compile it to CDSO. Then we
describe  how  to  perform  abstract  interpretation  on  CDSO  expressions  in  Section  6.5.  Finally,  in 
Section 6.6, we show how to obtain a refinement type for an expression, by performing abstract
interpretation at the relevant cells. We prove soundness for our refinement type inference system.


6.1  Introductory  examples 

We  begin  by  introducing  some  algorithms  on  integer  lists.  These  algorithms  will  not  only  be 
used  to  demonstrate  how  to  perform  refinement  type  inference,  but  will  also  serve  as  a  categorical 
combinator  compilation  target  for  the  list  functions  of  our  higher-level  language.  Figure  6.1  lists 
algorithms for null, head, tail, and cons. Given our earlier dcds definitions for bool, int, and intlist,
the algorithms will have the expected regular types:

null : intlist $\to$ bool,

hd : intlist $\to$ int,

tl : intlist $\to$ intlist,

cons : (int $\times$ intlist) $\to$ intlist.

These  types  will  be  obtained  by  our  type  inference  system  from  the  previous  chapter,  simply  by 
collecting  all  input  and  output  cells  and  values,  and  matching  them  to  the  dcds’s  in  the  subtype 
hierarchy. 

The algorithms for null and head are very simple. Null just needs to check if the only initial
cell, EMPTY, is filled with true or false. Hd first has to find EMPTY = false in order to be
allowed to ask for the value of cell N.1, which it copies over to the output.

Tl has two kinds of cells in the output: cells of the form EMPTY.1.1..., with zero or more 1
tags, and cells of the form N.1.1..., with one or more 1 tags. Just as in the case of the algorithms
on lazy natural numbers from Chapter 3, we use a variable, $T, to stand for zero or more tags. The
general form of the cell names will then be EMPTY.$T and N.$T.1. When trying to fill an output
cell of the form EMPTY.$T we need to copy over the contents of the input cell EMPTY.$T.1.
Before asking for the value of that cell, we must list in a from construct an input state that enables
our output cell. Similarly for the output cells of the form N.$T.1: we first list an input state that
enables those output cells in a from construct, then copy over the contents of input cell N.$T.1.1.

Despite  its  longer  length,  cons  is  actually  simpler  than  tl,  with  a  similar  structure.  In  this  case 
our  output  cells  have  the  general  form  EMPTY.%T.l.l  and  N.%T.l.l,  which  does  not  include  the 
first  three  cells  of  intlist ,  so  we  list  them  separately,  as  base  cases.  The  trees  which  compute  the 
values  of  N.l  and  EMPTY .1  do  not  need  a  from  construct,  because  the  cell  EMPTY  is  guaranteed 
to  be  filled  with  false  in  the  output,  regardless  of  the  input. 
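
To make the cell naming concrete, the following Python sketch models a finite list as a state (a mapping from cell names to values) and mimics which cells the four algorithms copy. It is an illustration under assumptions, not CDSO itself: the helper names cell and encode are hypothetical, list tags are written as the digit 1, the enabling check of the from construct is omitted for tl, and the product tags .1 and .2 on the inputs of cons are replaced by two separate arguments.

```python
def cell(base, tags):
    """Cell name with the given number of .1 tags, e.g. cell("EMPTY", 2) == "EMPTY.1.1"."""
    return base + ".1" * tags

def encode(xs):
    """State of intlist for a finite list: EMPTY flags plus one N cell per element."""
    state = {cell("EMPTY", len(xs)): True}
    for i, x in enumerate(xs):
        state[cell("EMPTY", i)] = False
        state[cell("N", i + 1)] = x
    return state

def null(state):
    return state["EMPTY"]

def hd(state):
    # N.1 may only be requested once EMPTY = false has been found.
    return state["N.1"] if state.get("EMPTY") is False else None

def tl(state):
    # Output cell C.$T is filled from input cell C.$T.1: every copied cell loses one tag.
    out = {}
    for name, value in state.items():
        base, tags = name.split(".")[0], name.count(".")
        if tags >= (1 if base == "EMPTY" else 2):
            out[cell(base, tags - 1)] = value
    return out

def cons(n, state):
    # Base cases EMPTY and N.1, then every cell of the tail gains one tag.
    out = {"EMPTY": False, "N.1": n}
    for name, value in state.items():
        base, tags = name.split(".")[0], name.count(".")
        out[cell(base, tags + 1)] = value
    return out
```

In this encoding, tl applied to the state of the empty list yields the empty state, since no output cell can be filled.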

Now  we  make  the  simple  observation  that  instead  of  collecting  all  input  cells  and  their  values 
and  all  output  cells  and  their  values  and  matching  that  to  various  dcds’s  to  obtain  a  type,  a  process 
which  essentially  flattens  the  structure  of  a  sequential  algorithm,  we  could  take  advantage  of  the  tree 
structure  by  just  collecting  input  and  output  cell  and  value  information  for  each  path  through  the 
algorithm.  A  path  is  any  sequence  of  from  and  valof  statements  that  ends  in  an  output  statement. 
We  collect  this  information  for  our  integer  list  algorithms  in  Table  6.1. 
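
The path collection just described can be sketched in Python as follows. The tree encoding (tuples tagged "valof", "from" and "output") is a hypothetical stand-in for the body of a request, chosen only for this illustration; each path yields the input events seen along the way together with the value that is output.

```python
# Hypothetical encoding of the body of one request:
#   ("valof", cell, {value: subtree, ...})
#   ("from", {cell: value, ...}, subtree)
#   ("output", value)

def paths(tree, seen=()):
    """Yield (input events, output value) for every path ending in an output."""
    kind = tree[0]
    if kind == "output":
        yield dict(seen), tree[1]
    elif kind == "from":
        yield from paths(tree[2], seen + tuple(tree[1].items()))
    elif kind == "valof":
        for value, subtree in tree[2].items():
            yield from paths(subtree, seen + ((tree[1], value),))

NULL_BODY = ("valof", "EMPTY", {"true": ("output", "tt"), "false": ("output", "ff")})
HD_BODY = ("valof", "EMPTY", {"false": ("valof", "N.1", {"$V": ("output", "$V")})})
TL_EMPTY_BODY = ("from", {"EMPTY.$T": "false"},
                 ("valof", "EMPTY.$T.1", {"$B": ("output", "$B")}))

for events, result in paths(NULL_BODY):
    print(events, "->", result)
```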

Before going further, let us assume that aside from the dcds’s bool, int, and intlist, we have
also defined the previously encountered refinements of bool and intlist: true, false, empty_intlist,
one_intlist, and many_intlist. The question that arises is: What happens if, for each algorithm,
we assign a type to each of its lines in Table 6.1? The answer turns out to be that we would get
the refinement type for the algorithm.

Let us denote a refinement typing judgment by the notation “:r”. If algorithm a has refinement
type τ, we will write this a :r τ. Further, let us abbreviate the names of the refinements of intlist

let null = algo
  request B do
    valof EMPTY is
      true : output tt
      false : output ff
    end
  end
end;

let hd = algo
  request N do
    valof EMPTY is
      false : valof (N.1) is
                $V : output $V
              end
    end
  end
end;

let tl = algo
  request (EMPTY.$T) do
    from {(EMPTY.$T)=false} do
      valof ((EMPTY.$T).1) is
        $B : output $B
      end
    end
  end

  request ((N.$T).1) do
    from {(EMPTY.$T)=false,
          ((EMPTY.$T).1)=false} do
      valof (((N.$T).1).1) is
        $V : output $V
      end
    end
  end
end;


let cons = algo
  request EMPTY do
    output false
  end

  request (N.1) do
    valof (N.1) is
      $V : output $V
    end
  end

  request (EMPTY.1) do
    valof (EMPTY.2) is
      $B : output $B
    end
  end

  request (((EMPTY.$T).1).1) do
    from {((EMPTY.$T).2)=false} do
      valof (((EMPTY.$T).1).2) is
        $B : output $B
      end
    end
  end

  request (((N.$T).1).1) do
    from {((EMPTY.$T).2)=false} do
      valof (((N.$T).1).2) is
        $V : output $V
      end
    end
  end
end;


Figure 6.1: Algorithms on integer lists

null
  input          | output
  EMPTY = true   | B = tt
  EMPTY = false  | B = ff

hd
  input                    | output
  EMPTY = false, N.1 = $V  | N = $V

tl
  input                                                | output
  EMPTY.$T = false, EMPTY.$T.1 = $B                    | EMPTY.$T = $B
  EMPTY.$T = false, EMPTY.$T.1 = false, N.$T.1.1 = $V  | N.$T.1 = $V

cons
  input                                  | output
  ∅                                      | EMPTY = false
  N.1 = $V                               | N.1 = $V
  EMPTY.2 = $B                           | EMPTY.1 = $B
  EMPTY.$T.2 = false, EMPTY.$T.1.2 = $B  | EMPTY.$T.1.1 = $B
  EMPTY.$T.2 = false, N.$T.1.2 = $V      | N.$T.1.1 = $V

Table 6.1: Input and output dependence for integer list algorithms


to empty, one, and many. Then, we would expect to be able to deduce the following refinement
typing judgments from Table 6.1:

null :r ∧[empty → true, one → false, many → false]
hd :r ∧[one → int, many → int]
tl :r ∧[one → empty, many → intlist]
cons :r ∧[(int × empty) → one, (int × one) → many, (int × many) → many]

In  the  rest  of  this  section,  we  shall  attempt  to  make  the  inference  of  such  refinement  types 
plausible,  by  looking  at  how  to  do  it  for  the  algorithms  null ,  hd ,  and  tl  in  more  detail.  We  defer  a 
detailed  discussion  of  cons  to  Section  6.3,  where  we  introduce  our  actual  algorithm  for  performing 
refinement  type  inference  for  algorithms. 

The procedure is quite simple for null and hd. Recall from our dcds definitions for empty,
one, and many in Section 5.2, that the event EMPTY = true could only come from dcds empty.
So the first line in the table for null generates the refinement type empty → true. The event
EMPTY = false could come from either one or many. So the second line in null’s table generates
the “raw” type:

∧[one, many] → false.

By distributing the meet over the arrow (just as we did last chapter in constructing the canonical
type for an algorithm), and combining the result with the type of the first line, we arrive at null’s
final refinement type:

null :r ∧[empty → true, one → false, many → false].
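
As a small illustration of this step, the Python sketch below maps each line of the table for null to a raw type and distributes the input meet over the arrow. The two lookup tables are assumptions that simply restate the facts quoted above (which refinements can contain each EMPTY event, and which refinement of bool each output value belongs to); the function name line_type is hypothetical.

```python
# Refinements of intlist that can contain each input event (restated from the text).
EVENT_SOURCES = {("EMPTY", "true"): ["empty"], ("EMPTY", "false"): ["one", "many"]}
# Refinement of bool that each output value belongs to.
OUTPUT_TYPE = {"tt": "true", "ff": "false"}

def line_type(event, output):
    """Type of one table line, with the input meet distributed over the arrow."""
    return [(source, OUTPUT_TYPE[output]) for source in EVENT_SOURCES[event]]

NULL_LINES = [(("EMPTY", "true"), "tt"), (("EMPTY", "false"), "ff")]

# Combine the (distributed) types of all lines into one meet of arrows.
null_type = [arrow for event, out in NULL_LINES for arrow in line_type(event, out)]
print(null_type)
```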

The only line in the input/output dependence table for hd gives us the following raw type:

∧[one, many] → int.

Again, we distribute the meet over the arrow to obtain the refinement type:

hd :r ∧[one → int, many → int].

The procedure is much more interesting in the case of tl, because we end up with meets on both
sides of the arrow in the raw type, and we have to introduce a dependency checking phase, which
attempts to eliminate the meets.

The first line of the table for tl gives us the following raw type:

∧[one, many] → ∧[empty, one, many].

We cannot simply distribute the meets in this case, because cell and value variables are involved,
and it is quite possible that certain instantiations of those variables will preclude some of the types
generated by a naive distribution of the meets. What we have to do then is to consider each
case in which the input comes from one of the types in ∧[one, many], which will give us certain
variable instantiations, and then see what happens to the type of the output when we use those
instantiations.

In our particular example, this works as follows: The input can be in either one or many.
Suppose it is in one. That means that $T gets bound to the empty tag, and $B to true. This
implies that the output event is EMPTY = true, which can only come from empty. So we have
“proven” a dependence between input one and output empty, therefore the type one → empty will
be one of the members of tl’s refinement type.

Now suppose the input comes from many. In this case, there are infinitely many possible
instantiations of the variables $T and $B. We cannot check all of them, but we can unroll many
enough times so that a “relevant” portion is exposed. This will be made precise later. For now,
let us suppose we have unrolled it as in Section 5.2. For each cell name and value list in the
resultant cva list, we match against the events in the input part of the first line of tl’s input/output
dependence table. Nothing interesting happens until we come to the following line from the cva
list:

((EMPTY.1).1) values true, false access (EMPTY.1)=false

When we match this against EMPTY.$T.1 = $B from the input, we get $T ↦ 1 and $B ↦
true, false. Applying this substitution to the output variables, we get EMPTY.1 with values
true, false. It turns out that this can only come from intlist, and hence we obtain another piece of
the refinement type: many → intlist. In retrospect, this should not prove surprising, since we are
not distinguishing between two- and three-element lists: When presented with an input from many,
tl can either return something in one, or something in many, so we must lose some precision and
give the resultant type of intlist.

The second line of the table for tl gives us the following raw type:

many → ∧[one, many].

This merely confirms what we already knew: There are instantiations of the variables such that
the output is in one and others such that the output is in many, therefore we must take the union
of one and many as the type of the result. We then obtain the following refinement type for tl:

tl :r ∧[one → empty, many → intlist].
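
The outcome of the dependency check for tl can be imitated on concrete data. The Python sketch below is only an analogy under assumptions: instead of unrolling the dcds many and matching against a cva list, it enumerates concrete lists up to a bound (max_len, a hypothetical stand-in for the unrolling depth), classifies input and output by length, and takes intlist as the union whenever more than one output refinement is observed.

```python
def classify(xs):
    """Refinement of intlist that a concrete list belongs to."""
    if len(xs) == 0:
        return "empty"
    return "one" if len(xs) == 1 else "many"

def union(types):
    """Union of observed refinements: a single one is kept, several collapse to intlist."""
    types = set(types)
    return types.pop() if len(types) == 1 else "intlist"

def tl_refinement(max_len=4):
    observed = {}
    for n in range(1, max_len + 1):        # tl produces no output on the empty list
        xs = list(range(n))
        observed.setdefault(classify(xs), set()).add(classify(xs[1:]))
    return {source: union(outs) for source, outs in observed.items()}

print(tl_refinement())
```

With a bound of at least 3, inputs in many are seen to produce outputs in both one and many, which is why the result for many is the less precise intlist.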

A natural question that arises after reading our informal description of the type inference
algorithm is: What happens when one cannot establish a dependence between input and output
types? The answer is that one can always establish a dependence when variables are involved. This
is due to the fact that we are only dealing with CDSO algorithms or states and not combinator
expressions, hence we do not have recursion (we will show how to handle recursion when we discuss
refinement type inference for combinator expressions). It will always be possible to determine, for
a given instantiation of the variables in the input, what happens to the output. The only time this
will not be possible is when the input and output are actually states that belong to the intersection
of several dcds’s. For example, the event EMPTY = false belongs to both one and many. In
such cases it is correct to distribute the meet.

(Ref-Refl)    τ ⊑ τ

              σ <* τ
(Ref-Sub)     ------
              σ ⊑ τ

              for each i ∈ 1..n there is a j ∈ 1..m such that σ_i ⊑ τ_j
(Ref-And)     ---------------------------------------------------------
              ∧[σ_1, ..., σ_n] ⊑ ∧[τ_1, ..., τ_m]

              σ_1 ⊑ σ_2    τ_1 ⊑ τ_2
(Ref-Arrow)   ----------------------
              σ_1 → τ_1 ⊑ σ_2 → τ_2

              σ_1 ⊑ σ_2    τ_1 ⊑ τ_2
(Ref-Prod)    ----------------------
              σ_1 × τ_1 ⊑ σ_2 × τ_2

Figure 6.2: Definition of refinement


6.2  Refinement  types 

Before  presenting  in  more  detail  the  algorithm  for  assigning  a  refinement  type  to  a  sequential 
algorithm,  we  shall  be  more  precise  about  what  we  mean  by  refinement  types  and  refinement 
typing  judgments. 

Intuitively, as described in the previous chapter, where we set the foundations for our typing
system, a refinement σ of a type τ is a subtype by partition, i.e., such that σ <p τ. In general, we
will allow a chain of subtypes by partition, σ <* τ. Of particular interest will be the cases when a
type is completely partitioned, as we will not be able to get meaningful refinement types otherwise.

For  the  non-ground  types,  we  present  type  inference  rules  for  determining  when  a  type  is  a 
refinement  of  another. 

Definition 6.2.1 (Refinement) We say that σ ⊑ τ (σ refines τ) if it can be deduced from the
inference rules in Figure 6.2.

Note that Ref-Sub actually implies Ref-Refl since it is always the case that τ <p τ. We list
Ref-Refl separately for clarity. The rule Ref-Arrow may seem a little strange, because it looks
like a covariant subtyping rule, but it means something else: it says that both the input and output
refinement types should be refinements of the respective regular types. For instance, we want to
have true → false ⊑ bool → bool (think of the case of boolean negation, from the first chapter).

The rule Ref-Prod is not surprising. It codifies what we would expect. For example, it should
be the case that true × false ⊑ bool × bool.

The rule Ref-And says that each member of the left hand side meet must be a refinement of
some member of the right hand side meet. Most useful to us will be the case when the right hand
side is not a meet, in which case the rule says that all types on the left hand side must refine the
right hand side. For example,

∧[empty → true, one → false, many → false] ⊑ intlist → bool.

Also note that the above definition implies that a refinement type has the same shape as the
regular type that it refines.
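
The refinement relation can be prototyped directly from the rules. The Python sketch below is a minimal reading of Figure 6.2 under assumptions: the partition hierarchy PARENT contains only the running examples, and the tuple encoding of types (("->", a, b) for arrows, ("x", a, b) for products, ("meet", [...]) for meets) is a hypothetical choice made for this illustration.

```python
# Partition hierarchy assumed from the running example (refinement -> regular type).
PARENT = {"true": "bool", "false": "bool",
          "empty": "intlist", "one": "intlist", "many": "intlist"}

def is_meet(t):
    return isinstance(t, tuple) and t[0] == "meet"

def refines(s, t):
    """Decide whether s refines t for ground names, arrows, products and meets."""
    if s == t:                                      # Ref-Refl
        return True
    if is_meet(s):                                  # Ref-And
        targets = t[1] if is_meet(t) else [t]
        return all(any(refines(m, u) for u in targets) for m in s[1])
    if isinstance(s, str) and isinstance(t, str):   # Ref-Sub: follow the partition chain
        while s in PARENT:
            s = PARENT[s]
            if s == t:
                return True
        return False
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0]:
        return refines(s[1], t[1]) and refines(s[2], t[2])   # Ref-Arrow, Ref-Prod
    return False

NULL_REFINEMENT = ("meet", [("->", "empty", "true"),
                            ("->", "one", "false"),
                            ("->", "many", "false")])
print(refines(NULL_REFINEMENT, ("->", "intlist", "bool")))
```

The three examples from the text (the arrow, the product, and the meet for null) are all accepted by this check, while the converse directions are rejected.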

The meaning of a refinement typing judgment for ground states is the same as for regular typing
judgments. However, for algorithms things are different, because an algorithm will not belong as
a state to a subtype of the refinement type. Rather, the refinement type is the type of just one
“slice” of the algorithm. Consider the algorithm null. It is the case that null ∈ D(intlist → bool)
but null ∉ D(empty → true). Instead, in view of the actual origin of the refinement types as types
of paths through an algorithm, the meaning of a refinement type should be of the form: If the
input has a certain refinement type, then the output has a certain refinement type. Considering
null again, if an input x has refinement type empty then it will be the case that null.x will have
refinement type true. With this in mind, we present the following definition.

Definition 6.2.2 (Meaning of refinement typing judgments) Given a sequential algorithm
a : ∧[σ_i → τ_i | i ∈ 1..n], we say that a :r ∧[δ_j → ζ_j | j ∈ 1..m] if ∧[δ_j → ζ_j | j ∈ 1..m] ⊑
∧[σ_i → τ_i | i ∈ 1..n] and if for any j, given x :r δ_j, a.x :r ζ_j.

As  a  sanity  check,  we  have  the  following  proposition  which  relates  regular  typing  judgments  to 
refinement  typing  judgments. 

Proposition 6.2.3 If a : σ → τ then a :r σ → τ.

Proof: By induction on σ and τ. In the base case, for a ground type, the meaning of : and :r is
identical. Now suppose it is the case that if x : σ then x :r σ and similarly for τ. But according to
Proposition 5.4.4, given x : σ, a.x : τ. This establishes our result. □

Adding  refinement  types  to  our  subtype  hierarchy  can  change  the  regular  types  of  algorithms 
in ways we may not want. For example, if we define the three refinements of intlist, the regular
type of tl becomes many → intlist. This is due to our requirement that all input cells and values
belong to the same dcds, and similarly for the output. So, whereas before we could apply tl to
a one-element list, now we cannot. The solution is to specify to the type inference system which
types should only be used as refinements. We introduce a special definition for this purpose:

refine true, false, empty_intlist, one_intlist, many_intlist;

This  way,  the  refinement  types  will  only  be  used  for  refinement  type  inference.  Also,  this  implies 
that  we  will  have  two  subtype  hierarchies:  a  regular  one,  and  a  refinement  one,  which  is  an  extension 
of  the  regular  subtype  hierarchy. 

Note that, as opposed to the Freeman-Pfenning approach, we are not telling the system which
types are the refinements of a particular type, but simply that it should use certain types only for
refinement type inference.


6.3 Refinement type inference for algorithms

We are now almost ready to present our refinement type inference algorithm. There is one additional
consideration, aside from the issues raised in previous sections, that must be faced, and that is
what to do when we have type variables in the refinement type but not in the regular type. A
simple example is provided by the left conjunction land, defined in Figure 2.3. The input/output
dependence information is shown below:


land
  input               | output
  B.1 = tt, B.2 = tt  | B = tt
  B.1 = tt, B.2 = ff  | B = ff
  B.1 = ff            | B = ff

The first two lines of the table give the types true × true → true and true × false → false, but
the third line results in false × α → false. We know from the regular type of this algorithm,
bool × bool → bool, that the type variable α really corresponds to bool. The question is: Should we
instantiate it to bool only, or also to its refinements?

Note  that  this  question  is  a  very  different  one  from  the  problem  of  instantiations  of  polymorphic 
refinement  type  variables  in  Freeman  and  Pfenning.  In  our  case  it  is  more  a  matter  of  how  to  report 
the  type  to  the  user;  such  types  will  not  be  used  with  refinement  type  inference  rules.  In  addition, 
by  Proposition  6.2.3,  any  choice  we  make  would  be  correct.  We  simply  want  the  type  returned  to 
the  user  to  give  an  accurate  impression  of  what  the  algorithm  can  do. 

In  our  particular  example  of  land  and,  in  general,  whenever  the  type  variable  matches  a  ground 
type,  we  can  instantiate  it  only  to  the  regular  type.  This  is  because  such  type  variables  can  only 
occur  in  the  input  (remember,  we  are  collecting  all  paths  that  end  in  an  output  statement,  so  we 
could  not  have  a  type  variable  in  the  refinement  type  output  that  does  not  correspond  to  a  type 
variable  as  well  in  the  regular  type  output).  So  the  type  we  would  report  for  land , 

∧[true × true → true, true × false → false, false × bool → false],

would  accurately  imply  that  we  can  apply  land  to  anything  below  bool,  in  its  right  input,  and  still 
get  false,  as  long  as  the  left  input  is  false. 
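
The reported type for land can be checked against the meaning of refinement typing judgments by brute force. The Python sketch below is an illustration under assumptions: land is modelled as an ordinary function whose right argument is a thunk (so that it is only requested when needed), only total boolean values are enumerated, and the names INHABITANTS, LAND_TYPE and check are hypothetical.

```python
def land(left, right):
    """Left conjunction: the right input (a thunk) is requested only if left is tt."""
    return right() if left else False

def undefined():
    raise RuntimeError("the right input was requested")

# Total values inhabiting each type in this toy model.
INHABITANTS = {"true": [True], "false": [False], "bool": [True, False]}

# Components (left input, right input, output) of the type reported for land.
LAND_TYPE = [("true", "true", "true"),
             ("true", "false", "false"),
             ("false", "bool", "false")]

def check(refinement):
    """For every component, each input of the given types must map into the output type."""
    for left, right, result in refinement:
        for x in INHABITANTS[left]:
            for y in INHABITANTS[right]:
                if land(x, lambda y=y: y) not in INHABITANTS[result]:
                    return False
    return True

print(check(LAND_TYPE))
print(land(False, undefined))   # the right input is never requested
```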

However, if the type variable corresponds to a higher-order type from the regular type, we will
instantiate it to all possible refinements. For example, if α corresponds to bool → bool, our
instantiation would be:

∧[true → true, true → false, true → bool, false → true,
false → false, false → bool, bool → true, bool → false, bool → bool].
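
The nine arrows above are simply all combinations of a refinement of the input with a refinement of the output. A minimal Python sketch, assuming a hypothetical table REFINEMENTS that lists bool together with its two refinements:

```python
from itertools import product

# Each regular ground type with its refinements and itself (assumed for this example).
REFINEMENTS = {"bool": ["true", "false", "bool"]}

def arrow_refinements(source, target):
    """All arrows whose input refines `source` and whose output refines `target`."""
    return [f"{s} -> {t}" for s, t in product(REFINEMENTS[source], REFINEMENTS[target])]

for arrow in arrow_refinements("bool", "bool"):
    print(arrow)
```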

We  now  present  our  algorithm  for  refinement  type  inference. 

Algorithm 6.3.1 (Refinement types for algorithms) Given a : ∧[σ_i → τ_i], to obtain the
refinement type of a do the following:

1. Collect dependence information for each path through the algorithm.

2. For each path, find the refinement type for the output and for its respective input.

3. If there are type variables in the refinement type, attempt to eliminate them by matching
against the regular type. If a type variable matches a ground type, instantiate it only to the
regular type; otherwise instantiate it to all possible refinements of the higher-order type.

4. If both the input and output types from the previous step contain meets, attempt to
eliminate them by examining dependence information. If unsuccessful, distribute the meet. If
the input is higher-order, do not distribute the left hand side meet.

5. If the type of any path is not a refinement of the regular type, eliminate it.

6. Eliminate types that are refined by other types (i.e., eliminate less specific types), unless the
less specific types are produced by the dependence examination step (step 4).

7. If the dependence information implied σ → τ_1 and also σ → τ_2, then replace these types with
σ → ∨[τ_1, τ_2].

8. Meet together the types of all remaining paths.

Each  of  the  steps  of  the  algorithm  is  executed  in  succession.  There  is  no  need  to  iterate. 
However,  the  algorithm  does  call  itself  recursively  in  step  2.  Since  the  size  of  the  input  for  each 
recursive  call  is  strictly  decreasing,  the  algorithm  is  guaranteed  to  terminate. 

As an example, consider the operation of this algorithm on cons, the longest of the integer list
algorithms. Table 6.2 shows the various paths through cons with their respective types after step 2
of the algorithm. In step 3 we instantiate the type variables, obtaining (without listing duplicates,
here or later):

(int × intlist) → ∧[one, many],

(int × ∧[empty, one, many]) → ∧[one, many],

(int × ∧[one, many]) → many.

In step 4, we attempt to find dependencies and eliminate meets, and can do so in the second line
above. We obtain the following types after step 4:

∧[(int × intlist) → one, (int × intlist) → many]

∧[(int × empty) → one, (int × one) → many, (int × many) → many]

∧[(int × one) → many, (int × many) → many]

During step 6 of the algorithm we get rid of the first and third lines, which are both refined by the
second one, and we obtain the final refinement type:

cons :r ∧[(int × empty) → one, (int × one) → many, (int × many) → many].
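
Step 6 on this example can be sketched in Python as follows. This is one possible reading, stated as an assumption: the three meets are flattened into their member arrows, duplicates are dropped, and an arrow is eliminated when some other arrow refines it. Each arrow (int × s) → t is abbreviated to the pair (s, t), since the int component never changes; the helper names are hypothetical.

```python
PARENT = {"empty": "intlist", "one": "intlist", "many": "intlist"}

def ground_refines(s, t):
    """s refines t if t is reached from s by following the partition hierarchy."""
    while s != t and s in PARENT:
        s = PARENT[s]
    return s == t

def arrow_refines(a, b):
    """(int x s1) -> t1 refines (int x s2) -> t2, compared componentwise."""
    return ground_refines(a[0], b[0]) and ground_refines(a[1], b[1])

def prune(arrows):
    """Drop duplicates, then drop every arrow that is refined by a different arrow."""
    unique = list(dict.fromkeys(arrows))
    return [a for a in unique
            if not any(b != a and arrow_refines(b, a) for b in unique)]

AFTER_STEP_4 = [("intlist", "one"), ("intlist", "many"),            # first line
                ("empty", "one"), ("one", "many"), ("many", "many"),  # second line
                ("one", "many"), ("many", "many")]                  # third line

print(prune(AFTER_STEP_4))
```

The surviving arrows are exactly the three members of the final refinement type of cons.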

We now provide more detail and justification for the various steps of the algorithm. Step 4 is
the dependence examination step. We have a type of the form

∧[σ_1, ..., σ_n] → ∧[τ_1, ..., τ_m]

and we attempt to find matches between the σ_i’s and the τ_j’s. As described before, we will let the
input be in each of the σ_i’s and see if that narrows down the choice of τ_j’s for the output.

We need to decide how much to unroll each of the σ_i. Suppose the line of the dependence table
that gave rise to our type had cells with maximum length l. Further, suppose that the maximum
depth of any of the σ_i, τ_j is d. Then we would unroll each of the types d + l times. In view of
the results of the previous chapter, this would guarantee that our types would have a chance to
generate any appropriate cell in the dependence table.

cons
  input                                  | output             | “raw” type
  ∅                                      | EMPTY = false      | α → ∧[one, many]
  N.1 = $V                               | N.1 = $V           | (int × α) → ∧[one, many]
  EMPTY.2 = $B                           | EMPTY.1 = $B       | (α × ∧[empty, one, many]) → ∧[one, many]
  EMPTY.$T.2 = false, EMPTY.$T.1.2 = $B  | EMPTY.$T.1.1 = $B  | (α × ∧[one, many]) → many
  EMPTY.$T.2 = false, N.$T.1.2 = $V      | N.$T.1.1 = $V      | (α × ∧[one, many]) → many

Table 6.2: First two steps in finding refinement type for cons.


Step 5 eliminates some types that may arise due to subtyping by extension. For example, if the
regular type of an algorithm were cPoint → bool, it is possible that a path through the algorithm
will have type point → true. We will not accept this as a refinement, because it does not involve
subtyping by partition. There is no meaningful way in which a point can be considered a refinement
of a cPoint.

Step  6  eliminates  types  that  may  have  resulted  from  instantiation  of  type  variables,  which  are, 
therefore,  not  as  precise  as  types  where  we  have  established  dependence  between  certain  input  and 
output  types.  We  have  seen  an  example  of  this  in  the  operation  of  the  algorithm  on  cons. 

Step 7 takes into account the situations when we must lose precision in the refinement type.
If it is the case that our dependence information implied both σ → τ_1 and also σ → τ_2 then we
cannot know which is the resultant type given an input in σ, so we must replace these types with
σ → ∨[τ_1, τ_2]. The union ∨[τ_1, τ_2] is guaranteed to exist, because both τ_1 and τ_2 are refinements of the
same type (which we ensured in step 5). Step 7 does not apply to cases when we could not establish
dependencies in step 4 and distributed a right hand side meet. As described before, such cases arise
in the absence of variables in the cell names and values, when a state belongs to several dcds’s.
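
Step 7 can be sketched in Python as follows, using tl as the example. The sketch rests on an assumption: the union of two refinements is computed as their least common ancestor in a partition hierarchy (the hypothetical table PARENT), which agrees with the text's use of intlist as the union of one and many.

```python
PARENT = {"empty": "intlist", "one": "intlist", "many": "intlist",
          "true": "bool", "false": "bool"}

def ancestors(t):
    """The type itself followed by everything above it in the hierarchy."""
    chain = [t]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def union(t1, t2):
    """Least type above both; assumes t1 and t2 refine a common regular type."""
    above_t2 = set(ancestors(t2))
    return next(t for t in ancestors(t1) if t in above_t2)

def merge_outputs(arrows):
    """Replace source -> t1 and source -> t2 by source -> union(t1, t2)."""
    merged = {}
    for source, target in arrows:
        merged[source] = union(merged[source], target) if source in merged else target
    return list(merged.items())

# Dependencies established for tl: one -> empty, many -> one, many -> many.
print(merge_outputs([("one", "empty"), ("many", "one"), ("many", "many")]))
```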

Theorem 6.3.2 (Soundness of refinement type inference) If Algorithm 6.3.1 infers the refinement
type τ for a, then, indeed, a :r τ.

Proof:  In  broad  outline,  Algorithm  6.3.1  works  by  assigning  a  refinement  type  to  each  input  and 
each  output  in  all  paths  through  the  algorithm.  This  certainly  leads  to  sound  refinement  types, 
according  to  our  definition  of  the  meaning  of  refinement  typing  judgments.  The  only  problems 
might  be  caused  by  the  modifications  we  make  to  the  refinement  types  along  the  way,  such  as 
eliminating  meets,  instantiating  type  variables,  and  taking  unions.  We  discuss  each  of  these  in 
turn. 

We have already discussed the instantiation of type variables from step 3 of the algorithm.
Because of Proposition 6.2.3, it is sound to instantiate regular types. Certainly instantiating
refinement types is also sound.

The elimination of meets from step 4 is sound because we will either prove dependence or have
a state that belongs to several dcds’s. As we have argued before, we can always prove dependence
when cell and value variables are involved, because we do not have recursion at this level. As can
be seen from the CDSO syntax in Appendix C.1, the constraints which can be placed on variables
are very simple, and easily decidable.

Finally, when we take unions in step 7, it is still true that given x :r σ, a.x :r ∨[τ_1, τ_2], because
τ_1 ≤ ∨[τ_1, τ_2], and the same holds for τ_2.

In conclusion, each path will have a refinement type σ → τ such that, given x :r σ, a.x :r τ. Also,
we explicitly eliminate any types which are not refinements of the regular type in step 5. Therefore,
our algorithm will lead to a refinement of the original type. □


6.4  A  higher-level  language 

As  we  have  already  explained  when  we  introduced  CDSO  in  Section  2.3,  the  language  was  meant  to 
be  a  compilation  target  from  the  beginning.  The  examples  presented  so  far  make  it  quite  clear  that 
one  would  not  want  to  be  programming  directly  in  CDSO,  unless  one  wanted  to  write  programs 
which  take  advantage  of  its  intensional  features,  and  which  cannot  be  written  in  an  extensional 
language.  For  most  purposes,  one  would  want  to  use  a  higher-level  language.  We  introduce  such  a 
language  in  this  section,  and  show  how  to  compile  it  to  CDSO. 

6.4.1 PCF

The  higher-level  language  is  a  lazy,  higher-order,  polymorphic,  functional  language.  We  call  it  PCF 
for  historical  reasons:  the  original  CDSO  interpreter  of  Devin  [30]  also  had  a  PCF  interpreter,  which 
actually  corresponded  to  the  original  PCF  [70].  We  started  with  a  similar  language  but  added  more 
features,  until  we  arrived  at  a  full-featured  functional  language.  We  have  kept  the  name. 

A  slightly  simplified  grammar  for  the  language  is  given  in  Figure  6.3.  The  full  version  can  be 
found  in  Appendix  C.2.  The  language  is  typed  in  the  usual  way. 

The  syntax  is  somewhat  similar  to  that  of  Standard  ML  of  New  Jersey:  the  binding  of  identifiers 
to  expressions,  lambda  abstraction,  products,  and  list  functions  are  the  same.  The  first  and  second 
projections  are  slightly  different,  as  are  some  of  the  basic  operations.  The  main  difference  comes 
in  the  definition  of  recursive  functions,  where  we  have  sacrificed  some  ease  of  readability  for  ease 
of  compilation. 

The  programs  for  boolean  negation  and  for  map  from  the  introduction  are  examples  of  programs 
written  in  this  language. 

6.4.2  Compilation  to  CDSO 

The  compilation  to  CDSO  is  the  same  as  that  to  categorical  combinators  [26];  the  combinators  denote 
sequential  algorithms,  therefore,  an  entire  PCF  program  will  also  denote  a  sequential  algorithm. 

The idea of the compilation is to have variables stored in an environment, and have a PCF
expression denote a function from the environment to its value. An environment containing variables
x_0, ..., x_n is implemented as nested pairs of the form ((...({}, x_n), ...), x_0), where {} denotes the
empty environment: in our case the empty CDSO state.

The  first  step  in  the  compilation  is  translation  of  the  PCF  expression  to  de  Bruijn  notation  [26], 
in  which  variables  are  replaced  by  natural  numbers.  The  de  Bruijn  terms  are  built  as  follows: 

1.  Any  natural  number  is  a  term, 

2.  If  M  and  N  are  terms,  then  MN  is  a  term, 

3. If M is a term, then λ.M is a term,

4. If M is a term, then fix M is a term, and so on for all other PCF built-in functions.

<program> ::= <expr>
            | val id = <expr>

<expr> ::= <const>
         | id
         | <expr> <expr>
         | fn id => <expr>
         | let id = <expr> in <expr> end
         | letrec id = <expr> in <expr> end
         | <expr> <op> <expr>
         | if <expr> then <expr> else <expr>
         | (<expr>, <expr>)
         | fst (<expr>)
         | snd (<expr>)
         | <expr> :: <expr>
         | hd <expr>
         | tl <expr>
         | nil
         | null <expr>
         | (<expr>)

<const> ::= true | false | integers

<op> ::= + | - | * | / | = | < | > | <= | >= | and | or

Figure 6.3: Syntax of higher-level language

x_DB(x_0,...,x_n) = min{ j | x = x_j }

(fn x => M)_DB(x_0,...,x_n) = λ. M_DB(x,x_0,...,x_n)

(M N)_DB(x_0,...,x_n) = M_DB(x_0,...,x_n) N_DB(x_0,...,x_n)

(let x = M in N)_DB(x_0,...,x_n) = ((fn x => N) M)_DB(x_0,...,x_n)

(letrec x = M in N)_DB(x_0,...,x_n) = ((fn x => N) (fix (fn x => M)))_DB(x_0,...,x_n)

(M, N)_DB(x_0,...,x_n) = (M_DB(x_0,...,x_n), N_DB(x_0,...,x_n))

(fst M)_DB(x_0,...,x_n) = fst M_DB(x_0,...,x_n)


Figure  6.4:  Translation  to  de  Bruijn  notation 


The translation of a term $M$ is called $M_{DB(x_0,\ldots,x_n)}$, where $FV(M) \subseteq \{x_0, \ldots, x_n\}$, and it works by keeping track of the variables already encountered, which act as an environment. It is shown in Figure 6.4. We have listed only one of the constants of PCF; the translation for the others is similar.

Now  that  we  have  replaced  variables  with  natural  numbers,  we  will  use  the  numbers  as  indices 
into  the  environment.  For  instance,  variable  0  will  become  snd,  variable  1  will  be  mapped  to 
snd  |  fst  and  so  on.  A  variable  becomes  code  which  pulls  out  a  particular  location  from  the 
environment. 
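The first compilation step can be illustrated with a small Python sketch. It follows the rules of Figure 6.4 for variables, abstraction, application, let, letrec, pairs and fst; the tuple encoding of terms and the names `to_de_bruijn` and `access_code` are assumptions made for this illustration and are not taken from the actual implementation.

```python
# Hypothetical term encoding (not the thesis's own datatypes):
#   ("var", x) | ("fn", x, M) | ("app", M, N) | ("let", x, M, N)
#   ("letrec", x, M, N) | ("pair", M, N) | ("fst", M) | ("fix", M)

def to_de_bruijn(term, env=()):
    """Translate a named term; env lists the variables seen so far, innermost first."""
    tag = term[0]
    if tag == "var":
        # tuple.index returns the smallest j with env[j] == x (the innermost binder).
        return ("idx", env.index(term[1]))
    if tag == "fn":
        _, x, body = term
        return ("lam", to_de_bruijn(body, (x,) + env))
    if tag == "app":
        return ("app", to_de_bruijn(term[1], env), to_de_bruijn(term[2], env))
    if tag == "let":
        _, x, m, n = term
        return to_de_bruijn(("app", ("fn", x, n), m), env)
    if tag == "letrec":
        _, x, m, n = term
        return to_de_bruijn(("app", ("fn", x, n), ("fix", ("fn", x, m))), env)
    if tag == "pair":
        return ("pair", to_de_bruijn(term[1], env), to_de_bruijn(term[2], env))
    if tag in ("fst", "fix"):
        return (tag, to_de_bruijn(term[1], env))
    raise ValueError(f"unknown term: {tag}")

def access_code(i):
    """Code that pulls variable i out of the nested-pair environment."""
    return " | ".join(["snd"] + ["fst"] * i)

# fn f => fn x => x (f x)
y_body = ("fn", "f", ("fn", "x",
          ("app", ("var", "x"), ("app", ("var", "f"), ("var", "x")))))
```

Here `to_de_bruijn(y_body)` replaces `x` by index 0 and `f` by index 1, and `access_code(1)` produces the string `snd | fst` mentioned in the text.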

The translation of a term $M$ in de Bruijn notation to categorical combinators, denoted $M_{CC}$, is shown in Figure 6.5, again listing only some of the constants from the de Bruijn notation as the code for the others is similar.

The translation deserves some comment. The basic constants of the language, i.e., true, false, and the integers, are encoded as non-strict algorithms from the environment to states of bool or int. The algorithm curry(fst).{B = tt}, for instance, has type $\forall\alpha.\ \alpha \to bool$. When applied to an environment, it ignores it and returns a state of bool.

Lambda  abstraction  becomes  currying.  In  the  code  for  application,  uncurry(id)  is  the  CDSO 
application  algorithm.  When  taking  fixpoints,  we  need  a  version  of  the  fixpoint  combinator  which 
works with environments; we call it Yenv. It has the type $\forall\alpha.\ (env \to \alpha \to \alpha) \to env \to \alpha$, and its implementation is:

let Yenv = curry(Y | uncurry(id));

Y is the normal fixpoint combinator, of type $\forall\alpha.\ (\alpha \to \alpha) \to \alpha$, which we implement by doing a "manual" translation from PCF to CDSO. In PCF the code for Y is:

fix (fn f => fn x => x (f x)),

which becomes the following CDSO code:

let Y = fix((curry(curry(uncurry(id) |
          <snd, uncurry(id) | <snd | fst, snd>>))).emptyenv);
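The second compilation step, from de Bruijn terms to categorical combinators, can be sketched in the same spirit. The Python function below covers only the rules of Figure 6.5 for constants, variables, abstraction, application, pairing, projections and conditionals; the tuple encoding and the name `to_cc` are hypothetical, and the output is plain text rather than an executable CDSO algorithm.

```python
# Hypothetical de Bruijn encoding: ("idx", i) | ("const", cell, value) | ("lam", M)
#   | ("app", M, N) | ("pair", M, N) | ("fst", M) | ("snd", M) | ("if", M, N1, N2)

def to_cc(term):
    """Render the categorical-combinator code of a de Bruijn term as a string."""
    tag = term[0]
    if tag == "idx":      # variable i becomes snd | fst | ... | fst (i times fst)
        return " | ".join(["snd"] + ["fst"] * term[1])
    if tag == "const":    # constants ignore the environment
        return f"curry(fst).{{{term[1]}={term[2]}}}"
    if tag == "lam":      # lambda abstraction becomes currying
        return f"curry({to_cc(term[1])})"
    if tag == "app":      # uncurry(id) is the application algorithm
        return f"uncurry(id) | <{to_cc(term[1])}, {to_cc(term[2])}>"
    if tag == "pair":
        return f"<{to_cc(term[1])}, {to_cc(term[2])}>"
    if tag in ("fst", "snd"):
        return f"{tag} | {to_cc(term[1])}"
    if tag == "if":
        _, m, n1, n2 = term
        return f"cond | <<{to_cc(m)}, {to_cc(n1)}>, {to_cc(n2)}>"
    raise ValueError(f"unknown term: {tag}")

# de Bruijn form of fn f => fn x => x (f x)
y_body = ("lam", ("lam", ("app", ("idx", 0), ("app", ("idx", 1), ("idx", 0)))))
print(to_cc(y_body))
```

Applied to `y_body`, the sketch yields the same combinator expression that appears inside `fix(...)` in the code for Y above, up to spacing.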

We need to define a CDSO algorithm for every built-in function of PCF. We have already defined the identity (id), first projection (fst), conditional (cond), left conjunction (land), and the
$true_{CC} = curry(fst).\{B = tt\}$

$false_{CC} = curry(fst).\{B = ff\}$

$n_{CC} = curry(fst).\{N = n\}$

$i_{CC} = snd \mid fst^{\,i}$

$(\lambda.\,M)_{CC} = curry(M_{CC})$

$(M\,N)_{CC} = uncurry(id) \mid \langle M_{CC}, N_{CC} \rangle$

$(M, N)_{CC} = \langle M_{CC}, N_{CC} \rangle$

$(fix\ M)_{CC} = Yenv\,.\,M_{CC}$

$(fst\ M)_{CC} = fst \mid M_{CC}$

$(snd\ M)_{CC} = snd \mid M_{CC}$

$(if\ M\ then\ N_1\ else\ N_2)_{CC} = cond \mid \langle\langle M_{CC}, N_{1\,CC} \rangle, N_{2\,CC} \rangle$

$(M\ and\ N)_{CC} = land \mid \langle M_{CC}, N_{CC} \rangle$


Figure 6.5: Translation to categorical combinators

list  algorithms.  We  give  a  sampling  of  the  others,  by  defining  addition  and  integer  equality  test. 
The  complete  list  of  CDSO  algorithms  which  are  used  to  compile  PCF  is  given  in  Appendix  B.3. 

The algorithm for addition has type $int \times int \to int$ and it works by using variables to record the values of the inputs and adding them:

let plus = algo
  request N do
    valof (N.1) is
      $V1: valof (N.2) is
        $V2: output $V1 + $V2
      end
    end
  end
end;

The integer equality test, of type $int \times int \to bool$, works in similar fashion (the notation "!=" means "not equal to"):

let equal = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
        $V2 with $V2 = $V1: output tt
        $V2 with $V2 != $V1: output ff
      end
    end
  end
end;
Given a PCF program $M$, the final result of the compilation to CDSO is $M_{CC}\,.\,\{\}$, i.e., we apply the categorical combinator translation of the program to the empty environment. For examples of translations of PCF programs, we turn to not and map from the introduction. We instruct our PCF interpreter to print out the code for the two programs:

$ print not;
not = curry (cond | < < snd, curry (fst) . B=ff >,
                      curry (fst) . B=tt >) . emptyenv

$ print map;
map = uncurry (id) | < curry (snd),
  Yenv . curry (curry (curry (cond | < < null | snd, curry (fst) . nil >,
    cons | < uncurry (id) | < snd | fst, hd | snd >,
      uncurry (id) | < uncurry (id) | < snd | fst | fst, snd | fst >,
        tl | snd >>>)))>. emptyenv


6.5  Abstract  interpretation 

When  performing  refinement  type  inference  on  CDSO  expressions  or  PCF  programs,  we  will  need 
to  ask  for  the  values  of  various  cells  in  the  expression.  This  section  describes  how  we  can  do  that 
without  looping,  and  without  always  having  to  resort  to  a  hard  bound  on  the  number  of  recursive 
iterations. 

6.5.1  Loop  detection 

The  Hughes  and  Ferguson  approach  to  loop  detection  for  sequential  algorithms,  which  we  described 
in  Section  2.5,  does  not  work  well  in  our  case  because  we  use  the  CDS02  operational  semantics 
(which  we  chose  for  overall  efficiency  over  CDS01,  as  explained  in  Section  2.3.7).  Our  cell  names 
can  incorporate  expressions,  and  grow  quite  large,  therefore  we  cannot  simply  check  for  equality 
of  cell  names  to  detect  when  a  cell  depends  on  itself.  It  is  possible  to  simplify  the  cell  names, 
and  collect  information  about  dependence  of  output  cells  on  input  cells,  as  Hughes  and  Ferguson 
do.  In  fact,  our  first  approach  had  this  form,  but  it  did  not  detect  many  loops,  and,  just  like  the 
Hughes  and  Ferguson  implementation,  was  extremely  space-inefficient.  This  forced  us  to  develop 
an  alternative  approach,  one  that  is  better  suited  to  CDS02. 

Recall  from  Section  2.3.7,  that  the  fixpoint  rule  has  the  form: 

(Fix') $fix(A)\ ?\ c \;\to\; A.fix(A)\ ?\ c$

In the process of computing $fix(A)\ ?\ c$, we may ask the questions $fix(A)\ ?\ c_1$, $fix(A)\ ?\ c_2$, and so on. If it is the case that, for some index $i$, we end up wanting to know $fix(A)\ ?\ c_i$, with $c_i = c$, then, clearly, this is a looping computation.

The problem is that the cell names $c_i$, $c$ may contain expressions embedded into them, so we cannot easily check for equality. What we shall do is apply a stock simplification to the cell names, without trying to evaluate the expressions inside, and check for syntactic equality. This works fairly well for PCF programs, and does not usually apply to CDSO programs.

The simplification we perform is to replace cell names of the form $(snd.(x_1, x_2))c$ with $x_2 c$. It turns out that when recursions get unwound in PCF, expressions of the form $snd.(x_1, x_2)$ get concatenated to cell names. This is due to the actual code for Y from the previous section, which is the only possible source for fixpoint computations in PCF. The questions we ask when evaluating PCF expressions will have the form $Y\ ?\ xc$, where $x$ is a complex expression.

We  give  an  example  of  how  this  works  in  practice.  Consider  the  following  PCF  program: 

val loop10 = letrec f = fn b => if b then true else f b in f end;

This program clearly loops when presented with a false input. The first two unrollings of the fixpoint computation, when the question loop10 ? {B = ff}B is asked, will lead to questions of the form $f\ ?\ c_1$ and $f\ ?\ c_2$ with Y = fix f, and the cells being:

c1 = uncurry (id) . (curry (curry (cond | <<snd, curry (fst) . {B=tt}>,
       uncurry (id) | <snd | fst, snd>>)), emptyenv)
     {B=ff} B

c2 = snd . ((emptyenv, Y), uncurry (id) . (curry (curry (cond |
       <<snd, curry (fst) . {B=tt}>, uncurry (id) | <snd | fst, snd>>)),
       emptyenv))
     snd . ((emptyenv, (uncurry (id) | <snd | fst, snd>) . ((emptyenv, Y),
       uncurry (id) . (curry (curry (cond | <<snd, curry (fst) . {B=tt}>,
       uncurry (id) | <snd | fst, snd>>)), emptyenv))), {B=ff})
     B

Applying the simplification to $c_2$ shows it to be syntactically equal to $c_1$, and so we have proved that loop10 deserves its name.
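The check just described can be sketched in Python under some simplifying assumptions. Cell names are modelled as tuples of components; an expression `f.x` is the hypothetical triple `("apply", f, x)` and a pair is `("pair", a, b)`. The names `simplify` and `LoopDetector` are illustrative only, and the large combinator expressions of the example are abbreviated to opaque strings.

```python
def simplify(expr):
    """Rewrite snd.(x1, x2) to x2 everywhere, without evaluating anything else."""
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(simplify(part) for part in expr)
    is_snd_of_pair = (
        len(expr) == 3 and expr[0] == "apply" and expr[1] == "snd"
        and isinstance(expr[2], tuple) and len(expr[2]) == 3
        and expr[2][0] == "pair"
    )
    return expr[2][2] if is_snd_of_pair else expr

class LoopDetector:
    """Remembers the simplified cells already asked of one fixpoint computation."""
    def __init__(self):
        self.pending = set()

    def is_loop(self, cell):
        key = simplify(cell)
        if key in self.pending:
            return True          # same question asked again: a looping computation
        self.pending.add(key)
        return False

# Abbreviated versions of c1 and c2 from the loop10 example.
F = ("apply", "uncurry(id)", ("pair", "body", "emptyenv"))
G = "uncurry(id) | <snd | fst, snd>"
c1 = (F, "{B=ff}", "B")
c2 = (
    ("apply", "snd", ("pair", ("pair", "emptyenv", "Y"), F)),
    ("apply", "snd", ("pair",
        ("pair", "emptyenv", ("apply", G, ("pair", ("pair", "emptyenv", "Y"), F))),
        "{B=ff}")),
    "B",
)
```

Asking a `LoopDetector` about `c1` and then `c2` reports a loop on the second question, because both cells simplify to the same tuple.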

6.5.2  Depth-bounding 

We  cannot  detect  all  loops  in  the  manner  presented  above.  Even  fairly  simple  functions,  which 
loop  on  some  input  which  is  slightly  modified  and  then  modified  back  to  the  original  form,  cannot 
always  be  detected.  For  example,  the  following  PCF  function  loops: 

val loop9 = letrec f = fn l => if null l then let l1 = f (1::l) in f l1 end
                               else f (tl l)
            in f end;

This cannot be detected, because we do not simplify expressions of the form tl (x::l) to l. We could add such new simplification rules, but we would, of course, still not be able to detect all loops, because loop detection is undecidable in general.

In  fact,  since  our  language  is  lazy,  we  have  to  contend  with  infinite  data  structures.  It  is  possible 
to  define,  for  instance,  an  infinite  list  of  ones: 

val ones = Y (letrec f = fn l => 1::(f l) in f end);

If  we  had  a  length  function,  detecting  a  loop  in  length  ones  would  not  be  possible,  even  in  the 
Hughes  and  Ferguson  approach.  So  rather  than  add  more  simplification  rules,  we  add  a  bound  on 
the  number  of  recursive  iterations,  a  bound  already  made  necessary  by  the  presence  of  infinite  data 
structures. 

We  modify  the  operational  semantics  of  fixpoint  to  keep  track  of  how  many  times  it  has  been 
called  while  computing  a  value  for  the  same  cell.  If  that  reaches  a  certain  bound,  we  interrupt  the 
computation.  We  note  that,  in  practice,  the  bound  can  be  set  to  a  very  low  value  (for  instance, 
30),  since  if  an  expression  will  not  loop,  its  computation  will  unroll  to  a  shallow  depth.  This  is 
dependent on the dcds’s defined in the system, and in particular on the refinements (as will be
explained  in  the  next  section).  The  refinements  we  use  can  be  distinguished  by  examining  at  most 
three  cells,  which  usually  leads  to  short  computations. 
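A depth bound of this kind can be sketched in Python as follows. This is only an analogy for the modified fixpoint rule: `bounded_fix` and `DepthExceeded` are hypothetical names, Python functions stand in for CDSO algorithms, and the counter tracks nested unrollings for one top-level question rather than per cell.

```python
class DepthExceeded(Exception):
    """Raised when a fixpoint has been unrolled `bound` times for one question."""

def bounded_fix(step, bound=30):
    """Return f with f = step(f), interrupting after `bound` nested unrollings."""
    depth = 0

    def f(arg):
        nonlocal depth
        if depth >= bound:
            raise DepthExceeded(f"more than {bound} unrollings")
        depth += 1
        try:
            return step(f)(arg)
        finally:
            depth -= 1   # leaving an unrolling, successfully or not

    return f

# Python analogue of loop10: returns true on true, loops on false.
loop10 = bounded_fix(lambda f: lambda b: True if b else f(b))
```

With this sketch `loop10(True)` returns immediately, while `loop10(False)` is interrupted with `DepthExceeded` once the bound of 30 is reached. A terminating recursion deeper than the bound would also be interrupted, which is why the observation above about shallow unrolling matters.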


6.6  Refinement  type  inference  for  expressions 

Given  a  CDSO  combinator  expression,  or  equivalently,  a  PCF  program,  we  shall  perform  refinement 
type  inference  for  it  by  first  obtaining  its  regular  type,  seeing  if  the  regular  type  admits  any  possible 
refinements,  then  generating  an  initial  set  of  relevant  cells  with  which  to  query  our  expression.  In  the 
process  of  querying  the  expression,  we  will  have  uncovered  a  state,  which  is  a  small  approximation 
to  the  combinator  expression.  We  then  perform  refinement  type  inference  on  the  state  using  the 
techniques  of  Section  6.3. 

A  ground  type  will  admit  refinements  when  it  has  subtypes  by  partition  in  the  refinement 
subtype  hierarchy.  A  compound  type  will  not  admit  refinements  when  either: 

•  The  type  is  fully  polymorphic,  or 

•  No  components  of  the  type  admit  refinements. 

For example, if int has no refinements, the type $int \to int$ does not admit refinements, but $int \to bool$ does, when refinements true and false are defined.
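The test for whether a type admits refinements can be written as a short recursive function. The sketch below assumes a hypothetical encoding in which a ground type is a string, a type variable is `("var", name)`, and a compound type is a tuple whose first element is the constructor; `refined` is the set of ground types that have subtypes by partition.

```python
def admits_refinements(ty, refined):
    """True if the type has at least one component with refinements."""
    if isinstance(ty, str):              # ground type
        return ty in refined
    if ty[0] == "var":                   # polymorphic component: nothing to refine
        return False
    # compound type: (constructor, component, component, ...)
    return any(admits_refinements(part, refined) for part in ty[1:])

refined = {"bool"}                       # true and false refine bool; int has none
print(admits_refinements(("->", "int", "int"), refined))
print(admits_refinements(("->", "int", "bool"), refined))
```

A fully polymorphic type such as `("->", ("var", "a"), ("var", "a"))` is rejected because none of its components is a refined ground type.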

6.6.1  Generating  relevant  cells 

A  given  type  and  its  refinements  can  always  be  distinguished  by  examination  of  a  finite  number  of 
cells.  This  is  due  to  the  fact  that  only  finitely  many  refinements  of  a  type  can  be  defined.  In  this 
section  we  discuss  how  to  automatically  generate  such  cells.  First  we  consider  ground  dcds’s. 

Recall from Definition 5.3.2, that a subtype by partition has the same initial cells as the supertype, but some may have fewer values. We are interested in exactly those cells. In particular, we want those distinguishing cells which do not have an infinite number of possible values. After we collect such initial cells, we shall look at all cells enabled by the initial cells, and collect distinguishing cells, and so on. We stop when there is no difference between the supertype and the subtype’s cells.

Algorithm 6.6.1 (Generating ground relevant cells) Given a type $\tau$ and a collection of its refinements, $\sigma_i$, the set of relevant cells is generated as follows:

1. Find the maximum depth among $\tau$ and the $\sigma_i$, and unroll each dcds to that depth.

2. Collect all initial cells from $\tau$ and the $\sigma_i$. Let us call such sets of cells $I_\tau$, $I_{\sigma_i}$.

3. Compare the $I_{\sigma_i}$ among each other: if the same cell exists in two of the $I_{\sigma_i}$, but with different values, and the set of values is not infinite, add it to the list of relevant cells.

4. Compare the $I_\tau$ to the $I_{\sigma_i}$ as above.

5. Generate the set of cells reachable from the initial cells for $\tau$ and the $\sigma_i$, and perform the same comparison as above. When adding a non-initial relevant cell, add also the cells which may enable it.

6. If there is no difference between the sets of cells for two $\sigma_i$, or a $\sigma_i$ and $\tau$, at this stage, but there was in the previous generation of cells, add all cells with non-infinite value lists.

7. Continue in this fashion, until all cells are the same, or no more cells can become enabled.


Example 6.6.2 As an example, we apply Algorithm 6.6.1 to intlist and its refinements, empty, one, and many. The maximum depth among the four dcds’s, as we have seen in the previous chapter, is 8 (the depth of many). The sets of initial cells generated from unrolling each of the four dcds’s 8 times are:

$I_{intlist} = I_{empty} = I_{one} = I_{many} = \{EMPTY\}$

$Relevant = \emptyset$

Cell EMPTY exists in all $I$ sets and it does have a different set of values, because of empty. Furthermore, the set of values is not infinite. Therefore we add it to the set of relevant cells. We generate the new $I$ sets:

$I_{intlist} = I_{one} = I_{many} = \{N.1,\ EMPTY.1\}$

$Relevant = \{EMPTY\}$

Cell N.1 has the same set of (infinite) values in each of the $I$ sets, so we discard it. However, EMPTY.1 again has different (finite) sets of values, so we add it to the relevant cells, and generate the next level of reachable cells:

$I_{intlist} = I_{many} = \{N.1.1,\ EMPTY.1.1\}$

$Relevant = \{EMPTY,\ EMPTY.1\}$

Cell N.1.1 is not suitable again, but also EMPTY.1.1 has the same values in both intlist and many. However, since the previous generation of cells had a difference (step 6), we add EMPTY.1.1 to the set of relevant cells:

$Relevant = \{EMPTY,\ EMPTY.1,\ EMPTY.1.1\}$
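The generation-by-generation comparison in this example can be mimicked with a simplified Python sketch. It is not the full Algorithm 6.6.1: enabling cells (step 5) are not tracked, each dcds is given directly as a list of already unrolled generations, and the concrete value sets `tt`/`ff` for the EMPTY cells are assumptions made purely for illustration.

```python
INFINITE = None  # marks a cell with infinitely many possible values

def differs(cell, generations):
    """True if `cell` has finite but unequal value sets in two generations."""
    value_sets = [g[cell] for g in generations if cell in g]
    if any(v is INFINITE for v in value_sets):
        return False
    return len({frozenset(v) for v in value_sets}) > 1

def relevant_cells(dcds_by_name):
    """dcds_by_name maps a type name to its unrolled generations of cells."""
    relevant, previous_differed = [], False
    depth = max(len(gens) for gens in dcds_by_name.values())
    for level in range(depth):
        generations = [g[level] for g in dcds_by_name.values() if level < len(g)]
        if len(generations) < 2:
            break
        cells = sorted(set().union(*generations))
        differing = [c for c in cells if differs(c, generations)]
        if differing:
            relevant += differing
            previous_differed = True
            continue
        if previous_differed:  # step 6: keep the finite cells of this generation
            relevant += [c for c in cells
                         if all(g.get(c, ()) is not INFINITE for g in generations)]
        break
    return relevant

# Hypothetical unrolled dcds's; the value sets are assumed, not taken from the text.
both = {"tt", "ff"}
types = {
    "intlist": [{"EMPTY": both}, {"N.1": INFINITE, "EMPTY.1": both},
                {"N.1.1": INFINITE, "EMPTY.1.1": both}],
    "empty":   [{"EMPTY": {"tt"}}],
    "one":     [{"EMPTY": {"ff"}}, {"N.1": INFINITE, "EMPTY.1": {"tt"}}],
    "many":    [{"EMPTY": {"ff"}}, {"N.1": INFINITE, "EMPTY.1": {"ff"}},
                {"N.1.1": INFINITE, "EMPTY.1.1": both}],
}
print(relevant_cells(types))
```

Under these assumed value sets, the sketch collects EMPTY, EMPTY.1 and EMPTY.1.1, in that order, as in the example.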


Using  Algorithm  6.6.1,  we  can  generate  relevant  cells  for  higher-order  types.  We  shall  never 
need  to  do  this  in  its  full  generality,  rather,  for  a  higher-order  type,  we  shall  need  to  generate  the 
initial  set  of  relevant  cells,  and  to  find  the  relevant  cells  enabled  by  a  state.  Both  these  tasks  are 
easy  to  accomplish  using  Definition  2.2.13  for  the  exponentiation  dcds. 


6.6.2  The  algorithm 

We  are  now  ready  to  present  our  algorithm  for  refinement  type  inference  for  expressions. 

Algorithm 6.6.3 (Refinement types for expressions) Given an expression $e$, do the following to find its refinement type:

1. Find regular type $\tau$ for $e$.

2. If $\tau$ does not admit refinements, return it as the refinement type. Otherwise generate initial set $I_\tau$ of relevant cells of $\tau$.

3. For each cell $c \in I_\tau$, ask the question $e\ ?\ c$.

4. If $e\ ?\ c \to v$, then add the event $(c, v)$ to an approximation $x$ of $e$. Find out what relevant cells are enabled by $(c, v)$ and add them to $I_\tau$.

5. If $e\ ?\ c \to valof\ c'$, then, if $c'$ is a relevant cell, and it has a finite set of possible values, then for each possible value $v'$, add $\{c' = v'\}c$ to $I_\tau$.

6. If $e\ ?\ c \to loop$, then go on to the next cell in $I_\tau$.

7. When the set $I_\tau$ is exhausted, apply Algorithm 6.3.1 to the approximation $x$.

The algorithm always terminates because we are using the techniques of the previous section to evaluate $e\ ?\ c$, and because there are finitely many relevant cells.

The  state  x  is  clearly  an  approximation  to  the  expression  e.  Since  we  apply  Algorithm  6.3.1  to 
x  to  generate  the  refinement  type,  an  algorithm  whose  soundness  we  have  proven  in  Theorem  6.3.2, 
it  follows  that  Algorithm  6.6.3  is  also  sound.  Hence, 

Theorem 6.6.4 [Soundness of refinement type inference for expressions] If, according to Algorithm 6.6.3, $e :_r \tau$ then, indeed, $e :_r \tau$.
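The querying loop of Algorithm 6.6.3 (steps 3 to 6) can be sketched as a worklist computation. In this Python illustration a cell is a pair of an input state and an output cell name, the expression is represented by a hypothetical function `ask` that answers a question with a value, a `valof` request, or `loop`, and the enabling of further relevant cells in step 4 is omitted. The final call to Algorithm 6.3.1 is not modelled.

```python
from collections import deque

def approximate(ask, initial_cells, finite_values):
    """Collect the events (cell, value) uncovered by querying the expression.

    ask(cell) returns ("value", v), ("valof", c2) or ("loop",).
    finite_values maps a relevant input cell to its finite list of values.
    """
    queue = deque(initial_cells)
    seen = set(initial_cells)
    events = {}
    while queue:
        cell = queue.popleft()
        answer = ask(cell)
        if answer[0] == "value":                      # step 4
            events[cell] = answer[1]
        elif answer[0] == "valof":                    # step 5
            state, name = cell
            needed = answer[1]
            for v in finite_values.get(needed, ()):
                refined = (state | {(needed, v)}, name)
                if refined not in seen:
                    seen.add(refined)
                    queue.append(refined)
        # step 6: on "loop" simply move on to the next cell
    return events

def ask_not(cell):
    """Toy stand-in for querying boolean negation."""
    state, _name = cell
    inputs = dict(state)
    if "B" not in inputs:
        return ("valof", "B")
    return ("value", "ff" if inputs["B"] == "tt" else "tt")

events = approximate(ask_not, [(frozenset(), "B")], {"B": ["tt", "ff"]})
```

For the toy negation, the initial question is answered by a `valof` request for input cell B, so two refined questions are generated, and the resulting approximation contains one event for each boolean input.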


Chapter  7 

Implementation  and  Examples 


In this chapter, we present an overview of our implementation and demonstrate the practical utility of our approach to refinement type inference with many examples. Section 7.1 covers the implementation, also pointing out differences with the theory of the previous two chapters. Section 7.2 contains CDSO examples, and Section 7.3 PCF examples.

7.1  Implementation 

Our  prototype  implementation  of  CDSO  with  type  inference  and  refinement  type  inference  is  in 
Standard  ML  of  New  Jersey,  version  0.93.  The  implementation  consists  of  about  15,000  lines  of 
code,  of  which  about  5,000  are  automatically  generated  by  YACC  and  LEX,  or  are  part  of  the 
YACC  base  environment.  The  reason  for  the  large  number  of  lines  of  YACC  and  LEX  code  is  that 
we  have  three  interpreters  as  part  of  the  system:  a  CDSO  interpreter,  a  PCF  interpreter,  and  a  cell 
name  interpreter,  for  accepting  user  input  in  the  request  loop. 

In  the  rest  of  this  section,  we  shall  attempt  to  give  an  idea  of  the  structure  of  the  CDSO 
interpreter,  and  also  describe  our  internal  representation  for  certain  notions  presented  in  previous 
chapters. 

7.1.1  Brief  overview 

Figure 7.1 shows a schematic diagram of the module dependencies in our implementation. Underlying the whole implementation are definitions of the CDSO parse tree, CDSO internal representations, CDSO runtime environment, and PCF parse tree. The Parser module is a conglomeration of the three parsers and lexers already mentioned. Printer is a somewhat pretty printer, and Match performs the binding of cell and value variables during execution. PcfCode implements the translation to categorical combinators, and Internal the translation from CDSO parse trees to forests. The translation to internal type representation (idcds) is in Type, which also implements algorithms for deciding subtyping for ground dcds’s. Deciding subtyping for type expressions is done in Subtype, and the type inference of regular types is in TypeChecker. The Evaluator implements the CDS02 operational semantics. QandA is the questions and answers module, which generates relevant cells, and finds a relevant approximation to an expression. The Refine module puts together type inference and refinement type inference. Finally, Toplevel takes care of the top level loop, the request loop, and of error reporting.

Figure 7.1: Module dependencies in our CDSO implementation (base definitions: CDSBasic, CDSInternal, CDSEnv, PCFBasic)

Of some interest to the reader may be our implementation of forests, first described in Section 2.3.6. The datatype definitions are listed in Figure 7.2, omitting the type definition of tag, which is not relevant here. Typical of all our CDSO definitions, the types of the internal representations of cells (icell), values (ivalue) and forests, are mutually recursive. This is due to the fact that we are implementing the CDS02 operational semantics, which allows expressions as part of a cell name. Hence, a cell name can be a name, a variable, a graft, a constrained name, or a functional name consisting of a forest and a cell name. The representation of tree instructions is just as in the theory. A basic forest is one given by enumeration of events, and is a pair of an integer and a list of trees; the integer specifies the degree of currification, i.e., the number of inputs. For instance, a forest of type $int$ will have degree 0, while a forest of type $int \to int \to int$ has degree 2.

We  would  also  like  to  point  out  the  internal  representation  used  by  the  questions  and  answers 
module  for  the  relevant  portion  of  a  type,  which  we  call  an  annotated  type: 


datatype refineUnit = Ground of typeExp * typeExp list
                    | Ho of annotated
withtype outputRefinement = (int list * refineUnit) list
and inputRefinement = (int * int list * refineUnit) list
and annotated = int * outputRefinement * inputRefinement

A typeExp is a type expression, such as a ground dcds name, a variable, or an arrow. The basic building block of an annotated type is a ground refineUnit, which lists the regular type name, and also a list of all refinements of that type. This is used to generate relevant cells. The integer lists in the input and output refinement types keep track of the product tag, if any, of that piece of type. In addition, an inputRefinement also stores the index of the input it came from. Finally, an annotated type is a triple of an integer specifying the degree of currification, an outputRefinement and an inputRefinement.
datatype icell = Icell_name of string
               | Icell_var of string
               | Icell_fun of forest * icell
               | Icell_graft of icell * tag
               | Icell_with of icell * iboolexp
and ivalue = Ival_string of string
           | Ival_output of ivalue
           | Ival_valof of icell
           | Ival_arexpr of arexpr
           | Ival_omega
           | Ival_with of string * iboolexp
           | Ival_pair of ivalue * ivalue
and iboolexp = Iboolexp_gt of arexpr * arexpr
             | Iboolexp_gteq of arexpr * arexpr
             | Iboolexp_lt of arexpr * arexpr
             | Iboolexp_lteq of arexpr * arexpr
             | Iboolexp_eq of ivalue * ivalue
             | Iboolexp_noteq of ivalue * ivalue
             | Iboolexp_or of iboolexp * iboolexp
             | Iboolexp_and of iboolexp * iboolexp
and tree_instruction = tree_Valof of icell * int * tree_query list
                     | tree_From of icell * int * tree_query list
                     | tree_Result of int * ivalue
and forest = forest_basic of int * tree list
           | forest_apply of forest * forest
           | forest_comp of forest * forest
           | forest_fix of forest
           | forest_curry of forest
           | forest_uncurry of forest
           | forest_pair of forest list
           | forest_prod of forest list
withtype tree_query = ivalue * tree_instruction
and tree = icell * tree_instruction


Figure 7.2: Internal representation of forests
7.1.2 Differences between implementation and theory

The implementation of our CDSO interpreter is a very close match to the theory previously developed. However, there are certain differences. None of them are fundamental in any theoretical sense; rather they are due to a lack of time for implementation. We enumerate the more important discrepancies below:

1.  Overloaded  types  are  treated  like  intersection  types.  Overloading  is  a  side  issue  as  far  as 
refinement  type  inference  is  concerned.  We  have  described  how  to  handle  it  when  presenting 
our  type  inference  algorithm  for  CDSO  for  the  sake  of  completeness,  because  it  is  part  of 
CDSO. 

2. Not all dependencies between input and output variables (cf. Algorithm 6.3.1) are detected. In
particular,  those  dependencies  involving  variables  in  arithmetic  expressions  are  not  detected. 
This  would  require  a  bit  of  machinery  to  implement  fully,  but  it  is  not  problematic  from  a 
theoretical  point  of  view.  It  is  not  particularly  relevant,  because  we  have  concentrated  on 
examples  in  which  this  does  not  occur. 

3.  We  treat  certain  types  which  have  a  mixture  of  polymorphic  and  non-polymorphic  types  as 
having  a  fully  polymorphic  type,  even  though  they  may  still  admit  refinements.  This  could 
be  detected,  and  the  refinement  type  found,  but  its  omission  does  not  materially  affect  the 
kinds  of  examples  we  can  handle. 

7.2  CDSO  examples 

Even though our implementation is a prototype, and no particular attention has been paid to fast execution, our results demonstrate that our approach to refinement type inference is practical. Unless specified otherwise, all examples presented below run in under one second. There are certain exceptions, however, which will be pointed out. When we mention running time, we are referring to our benchmark system, which is a Pentium Pro 200MHz with 256K L2 cache, 64 MB of RAM and 128 MB swap space, running Linux Red Hat 4.0. The running time is elapsed time.

Most  of  the  CDSO  examples  we  present  are  algorithm  definitions.  This  is  due  to  the  fact  that 
the  low-level  nature  of  the  language  makes  it  difficult  to  write  complex  expressions.  We  begin  with 
examples  from  Curien’s  book  [26]. 

When  the  CDSO  interpreter  starts  up,  it  loads  the  base  PCF  environment,  and  leaves  the  user 
at  the  CDSO  prompt,  denoted  by  #.  This  will  be  discussed  in  detail  in  the  next  section.  Typing  is 
optional  so  we  turn  it  on,  and  define  some  of  the  types  we  have  already  encountered. 

CDSO  version  1.1  -  June  11,  1997 

#  typing  on; 

#  let  bool  =  dcds  cell  B  values  tt,ff  end; 

Type  bool  defined. 

#  let  int  =  dcds  cell  N  values  [..]  end; 

Type  int  defined. 

#  let  true  =  dcds  cell  B  values  tt  end; 

Type  true  defined. 

#  let  false  =  dcds  cell  B  values  ff  end; 

Type  false  defined. 

#  refine  true,  false; 
Now very simple examples will have the expected types:

# {B=tt};
r: true
 : bool
request? ;

# {N=1};
r: int
 : int
request? ;

# {X=3};
Error: Type inference: term does not have a type
request? ;

Refinement typing judgments are denoted by "r:", and regular typing judgments by ":". The state {B = tt} has the expected refinement type true and regular type bool, while {N = 1} has int as both regular and refinement type. Note that typing in some state involving cells not occurring in any previously defined dcds results in a type error, but the interpreter permits evaluation to continue.

We  arrive  at  more  interesting  results  when  we  type  in  algorithms.  We  try  this  for  boolean 
negation  and  left  conjunction: 

#  let  not  = 
algo 

request  B  do 
valof  B  is 

tt  :  output  ff 
ff  :  output  tt 
end 
end 
end; 

r:  /\ [false  ->  true,  true  ->  false] 

:  bool  ->  bool 
Abbreviation  "not"  defined. 

# let land =
algo
  request B do
    valof (B.1) is
      tt: valof (B.2) is
            tt: output tt
            ff: output ff
          end
      ff: output ff
    end
  end
end;

r: /\[(false * bool) -> false, (true * true) -> true, (true * false) -> false]
 : (bool * bool) -> bool
Abbreviation "land" defined.




Obtaining  the  refinement  types  above  involves  a  simple  application  of  Algorithm  6.3.1.  The  case 
of  land  involves  instantiation  of  a  type  variable  to  bool ,  as  discussed  in  the  previous  chapter. 

We can define a curried version of land in two ways: one is the curry_land from Figure 2.5, but we can also apply curry to land, thus obtaining a combinator expression. Obtaining a refinement type for the combinator expression involves a different algorithm altogether than for curry_land, but, as expected, the types are the same:

#  curry (land) ; 

r:  /\ [false  ->  bool  ->  false,  true  ->  false  ->  false,  true  ->  true  ->  true] 

:  bool  ->  bool  ->  bool 
request?  ; 

# let curry_land =
algo
  request {}B do
    valof B is
      tt: output valof B
      ff: output output ff
    end
  end

  request {B=tt}B do
    from {B=tt} do
      output output tt
    end
  end

  request {B=ff}B do
    from {B=tt} do
      output output ff
    end
  end
end;

r: /\[false -> bool -> false, true -> false -> false, true -> true -> true]
 : bool -> bool -> bool
Abbreviation "curry_land" defined.

We  can  implement  a  right  conjunction,  and  also  a  left  strict  conjunction.  The  interesting  thing 
is  that  we  can  distinguish  between  these  programs  and  left  conjunction  based  on  their  refinement 
type: 

#  let  rand  =  algo 
request  B  do 

valof  (B.2)  is 

tt:  valof  (B.l)  is 
tt :  output  tt 
ff:  output  ff 
end 

ff:  output  ff 
end 
end 
end; 
r: /\[(bool * false) -> false, (true * true) -> true, (false * true) -> false]
 : (bool * bool) -> bool
Abbreviation "rand" defined.

# let lsand =
algo
  request B do
    valof (B.1) is
      tt: valof (B.2) is
            tt: output tt
            ff: output ff
          end
      ff: valof (B.2) is
            tt: output ff
            ff: output ff
          end
    end
  end
end;

r: /\[(false * false) -> false, (true * true) -> true, (false * true) -> false,
      (true * false) -> false]
 : (bool * bool) -> bool
Abbreviation "lsand" defined.

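rand has the refinement type $(bool \times false) \to false$, while land has type $(false \times bool) \to false$. The idea is that we can use the refinement type to infer strictness information: a component of type bool marks an input that the algorithm does not need to inspect in that case. lsand does not have any refinement type involving bool, since it always examines both of its inputs.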
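The connection between these refinement types and strictness can be made concrete with a small Python experiment. It is not Algorithm 6.3.1: the three conjunctions are re-coded as Python functions that read their inputs through a callback, and the hypothetical helper `rows` explores every way of supplying inputs on demand, writing `bool` for an input that was never read.

```python
class Need(Exception):
    """Signals that the algorithm asked for an input that is not yet known."""
    def __init__(self, index):
        self.index = index

def rows(algo, arity=2, known=None):
    """List (argument refinements, result refinement) pairs for a boolean algorithm."""
    known = known or {}

    def read(i):
        if i not in known:
            raise Need(i)
        return known[i]

    try:
        out = algo(read)
    except Need as need:
        result = []
        for v in (True, False):          # branch on both values of the needed input
            result += rows(algo, arity, {**known, need.index: v})
        return result
    name = lambda b: "true" if b else "false"
    args = tuple(name(known[i]) if i in known else "bool"
                 for i in range(1, arity + 1))
    return [(args, name(out))]

def land(read):   # left conjunction: second input read only if the first is true
    return read(2) if read(1) else False

def rand(read):   # right conjunction: first input read only if the second is true
    return read(1) if read(2) else False

def lsand(read):  # left strict conjunction: always reads both inputs
    a, b = read(1), read(2)
    return a and b
```

Calling `rows(land)` produces a row `(("false", "bool"), "false")`, `rows(rand)` produces `(("bool", "false"), "false")`, and no row of `rows(lsand)` mentions `bool`, mirroring the three refinement types printed by the interpreter.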

One kind of program that we can write in CDSO but not in PCF is an algorithm that does semantic manipulation. By this we mean that the algorithm does different things depending on how its input reacts to various inputs. A fascinating example of such an algorithm is called AND_TASTER, and was first described by Berry and Curien [7]. The algorithm takes as input an algorithm on $(bool \times bool) \to bool$ and determines if it is a conjunction algorithm, and if so which one. The full text of the algorithm is given in Appendix B.5. Here we define the type of its output, together with refinements.

#  let  and_type  = 
dcds 

cell WHICH_AND values IS_LEFT_AND, IS_LEFT_STRICT_AND,
IS_RIGHT_AND, IS_RIGHT_STRICT_AND,
IS_NOT_AN_AND

end; 

Type  and_type  defined. 

#  let  is_and_type  = 
dcds 

cell WHICH_AND values IS_LEFT_AND, IS_LEFT_STRICT_AND,
IS_RIGHT_AND, IS_RIGHT_STRICT_AND

end; 

Type  is_and_type  defined. 
# let is_not_and_type =
dcds

cell WHICH_AND values IS_NOT_AN_AND
end;

Type is_not_and_type defined.

# refine is_and_type, is_not_and_type;

Given these refinements, AND_TASTER has an incredibly detailed type, which is also listed in
Appendix  B.5.  In  this  case,  the  intensional  information  is  overwhelming:  there  is  a  type  for  each 
possible  branch  through  the  program,  and  so  the  refinement  type  ends  up  being  not  much  more 
succinct  than  the  code  itself. 
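
To give a feel for what such a taster observes, here is a minimal Python sketch. It is not the CDSO algorithm of Appendix B.5: it assumes that conjunctions are ordinary Python functions taking two thunks, and it detects the evaluation order by side effects, something the sequential-algorithm semantics provides directly. The names `probe`, `Undefined` and `and_taster` are hypothetical, and `None` stands for an undefined input.

```python
from itertools import product

class Undefined(Exception):
    """Stands in for forcing an argument whose value is undefined."""

def probe(f, x, y):
    """Run f on thunks for x and y (None means undefined).

    Returns the result (None if f forced an undefined argument) and the
    order in which the arguments were forced."""
    order = []

    def make(side, value):
        def thunk():
            order.append(side)
            if value is None:
                raise Undefined(side)
            return value
        return thunk

    try:
        result = f(make("left", x), make("right", y))
    except Undefined:
        result = None
    return result, order

def and_taster(f):
    # Extensional check: on defined inputs f must agree with conjunction.
    for x, y in product([False, True], repeat=2):
        result, _ = probe(f, x, y)
        if result is not (x and y):
            return "IS_NOT_AN_AND"
    # Intensional check 1: which argument is inspected first?
    _, order = probe(f, True, True)
    first = order[0]
    # Intensional check 2: is the other argument inspected even when
    # the first one is already false?
    if first == "left":
        result, _ = probe(f, False, None)
    else:
        result, _ = probe(f, None, False)
    side = "LEFT" if first == "left" else "RIGHT"
    return f"IS_{side}_STRICT_AND" if result is None else f"IS_{side}_AND"

def land(a, b):
    return b() if a() else False

def rand(a, b):
    return a() if b() else False

def lsand(a, b):
    x = a()
    y = b()
    return x and y

def lor(a, b):
    return True if a() else b()

for f in (land, rand, lsand, lor):
    print(f.__name__, and_taster(f))
```

The five possible answers correspond to the values of the WHICH_AND cell defined above.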

We  now  turn  our  attention  to  the  implementation  of  lazy  natural  numbers  we  presented  earlier, 
and  show  how  our  refinement  types  helped  catch  an  error.  We  begin  by  defining  lazy  natural 
numbers  and  two  refinements,  empty,  and  non-empty: 

# letrec lnat =
dcds

cell B values 0,1
graft (lnat.s) access B = 1
end; 

Type  lnat  defined. 

#  let  empty_lnat  =  dcds 
cell  B  values  0 

end; 

Type empty_lnat defined.

# local letrec partial_lnat = dcds

cell (B.s) values 0,1 access B = 1
graft (partial_lnat.s) access B = 1
end

in let some_lnat = dcds
cell B values 1

cell (B.s) values 0,1 access B = 1
graft (partial_lnat.s)
end
end;

Type some_lnat defined.

# refine empty_lnat, some_lnat;

Some of the algorithms on lazy natural numbers presented earlier have the expected refinement
types.  The  refinement  type  of  successor  can  be  simplified  by  removing  the  middle  type,  but  our 
interpreter  currently  does  not  handle  this. 

# let Somega = fix(Srec);
r: some_lnat

: lnat

Abbreviation "Somega" defined.

# let S = fix(succ_rec);

r: /\[some_lnat -> some_lnat, lnat -> some_lnat, empty_lnat -> some_lnat]

: lnat -> lnat
Abbreviation "S" defined.

It turns out that our first implementation of left minimum was erroneous: if it received
a 0 in the left input, it still checked the right input, instead of placing a 0 on the output right away.
Given a right input which looped, this program would, of course, loop as well. We did
not catch this error until we implemented refinement type inference and observed the following
type for the program:

# let bad_left_min = fix(bad_left_min_rec);
r: /\[(some_lnat * some_lnat) -> some_lnat,

(empty_lnat * empty_lnat) -> empty_lnat,

(some_lnat * empty_lnat) -> empty_lnat,

(empty_lnat * some_lnat) -> empty_lnat]

:  (lnat  *  lnat)  ->  lnat 
Abbreviation  "bad_left_min"  defined. 

This provided a clue that the program did not do what we intended: there is no branch of the form
(empty_lnat * lnat) -> empty_lnat, so the right input is inspected even when the left input is 0.
The revised, correct program, shown in Appendix B.1, has the expected type:

# let left_min = fix(left_min_rec);

r: /\[(some_lnat * some_lnat) -> some_lnat, (empty_lnat * lnat) -> empty_lnat,
(some_lnat * empty_lnat) -> empty_lnat]

: (lnat * lnat) -> lnat
Abbreviation "left_min" defined.
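
The difference between the two versions can be reproduced in a few lines of Python. This is a sketch under stated assumptions and not the CDSO code of Appendix B.1: a lazy natural number is modelled as a thunk that returns `None` for 0 or another thunk for the predecessor, and a diverging input is modelled by a thunk that raises an exception when forced. All names below are hypothetical.

```python
def zero():
    """The lazy natural 0."""
    return None

def succ(n):
    """Successor of the lazy natural n."""
    return lambda: n

def from_int(k):
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    """Force a lazy natural completely and count its successors."""
    count = 0
    while True:
        n = n()
        if n is None:
            return count
        count += 1

def loop():
    """Stands in for an input whose evaluation never terminates."""
    raise RuntimeError("forced a diverging input")

def left_min(m, n):
    def result():
        pm = m()
        if pm is None:
            return None          # left input is 0: output 0 right away
        pn = n()
        if pn is None:
            return None
        return left_min(pm, pn)
    return result

def bad_left_min(m, n):
    def result():
        pm = m()
        pn = n()                 # bug: right input forced even if left is 0
        if pm is None or pn is None:
            return None
        return bad_left_min(pm, pn)
    return result

print(to_int(left_min(from_int(3), from_int(5))))
print(to_int(left_min(zero, loop)))
try:
    to_int(bad_left_min(zero, loop))
except RuntimeError as err:
    print("bad_left_min:", err)
```

On defined inputs the two functions agree, which is why ordinary testing is unlikely to reveal the bug. They differ only when the left input is 0 and the right input diverges.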

7.3  PCF 

We mentioned previously that upon startup the CDSO interpreter loads in the base PCF environment.
This consists of the type definitions and the combinators required to compile PCF to CDSO.
There  are  actually  two  base  environments  for  PCF  that  we  implemented:  one  contains  refinement 
types  and  one  does  not.  We  first  discuss  the  types  obtained  for  the  terms  in  the  base  environments 
before  turning  our  attention  to  PCF. 

7.3.1  Base  environment 

There  are  two  ways  of  starting  up  the  CDSO  interpreter:  regular  or  refinement  typing.  The  choice 
of  typing  only  applies  to  PCF.  We  discuss  refinement  typing,  since  it  subsumes  regular  typing. 
The  complete  listing  of  CDSO  programs  which  make  up  the  refinement  compilation  environment  is 
given  in  Appendix  B.3.  The  complete  transcript  of  the  interpreter  processing  the  CDSO  programs 
is  given  in  Appendix  B.4.  Here  we  discuss  an  abbreviated  list.  Note  that  the  refinement  type  and 
regular  type  are  listed  before  the  name  of  the  algorithm. 

- cdsO(refined);

-- Loading PCF constants.

Type  bool  defined. 

Type  int  defined. 

Type  true  defined. 

Type  false  defined. 

r: /\[((false * 'a) * 'b) -> 'b, ((true * 'c) * 'd) -> 'c]

: ((bool * 'a) * 'a) -> 'a
Abbreviation "cond" defined.
r: ('a * 'b) -> 'a
: ('a * 'b) -> 'a
Abbreviation "fst" defined.
r: ('a * 'b) -> 'b
: ('a * 'b) -> 'b
Abbreviation "snd" defined.
r: (int * int) -> int
: (int * int) -> int
Abbreviation "plus" defined.

r: (int * int) -> bool
: (int * int) -> bool
Abbreviation "equal" defined.

r: 'a -> 'a
: 'a -> 'a

Abbreviation "id" defined.
r: 'a
: 'a

Abbreviation "emptyenv" defined.
r: ('a -> 'a) -> 'a
: ('a -> 'a) -> 'a
Abbreviation "Y" defined.
r: ('a -> 'b -> 'b) -> 'a -> 'b
: ('a -> 'b -> 'b) -> 'a -> 'b
Abbreviation "Yenv" defined.

Type  intlist  defined. 

Type empty_intlist defined.

Type one_intlist defined.

Type many_intlist defined.
r: empty_intlist
: intlist

Abbreviation "nil" defined.

r: /\[one_intlist -> false, empty_intlist -> true, many_intlist -> false]

: intlist -> bool
Abbreviation "null" defined.

r: /\[(int * one_intlist) -> many_intlist, (int * many_intlist) -> many_intlist,
(int * empty_intlist) -> one_intlist]

: (int * intlist) -> intlist
Abbreviation "cons" defined.

r: /\[one_intlist -> int, many_intlist -> int]

: intlist -> int
Abbreviation "hd" defined.

r: /\[many_intlist -> intlist, one_intlist -> empty_intlist]

:  intlist  ->  intlist 
Abbreviation  "tl"  defined. 
The first interesting type to observe is that for conditional. When collecting cell names and
values along the two possible paths to an output through cond, we can establish a dependence
between the variable cell names and values: when the input is true, the output cell gets its value
from the left input, and when the input is false, it gets its value from the right input. By unifying
the matching type variables, we can obtain the more precise refinement type.

The  identity,  projections,  Y  and  Yenv  combinators  have  fully  polymorphic  types,  and  so  do  not 
admit  refinements.  Also  having  a  polymorphic  type  is  the  empty  environment,  which  is  implemented 
as  the  empty  state.  The  reason  for  this  is  that  the  empty  state  can  be  part  of  any  dcds. 

We cannot obtain interesting refinement types for the addition and integer equality
algorithms presented in the previous chapter. The type of addition does not admit refinements. As
for the equality test, the types of its two paths are (int x int) -> true and (int x int) -> false. In the
last step of Algorithm 6.3.1, we make this into (int x int) -> bool, because we must lose precision.

Finally,  the  algorithms  on  integer  lists  have  the  expected  types,  as  discussed  in  the  previous 
chapter. 

7.3.2  Examples 

We can switch from the CDSO interpreter to the PCF one by typing the command pcf. At this
point, the prompt changes to $ and the base environment for PCF becomes the current environment.
The refinement types for all of the examples in this section are obtained through the use of
Algorithm 6.6.3, by entering a questions and answers session with the expression which exposes a
relevant state, which is then typed using Algorithm 6.3.1.

We begin with two examples on bool -> bool: boolean negation and a function which always
returns  true. 

$  val  not  =  fn  x  =>  if  x  then  false  else  true; 
r:  /\ [false  ->  true,  true  ->  false] 

:  bool  ->  bool 
Abbreviation  "not"  defined. 

$  val  exclmid  =  fn  x  =>  x  or  (not  x) ; 
r: /\ [false -> true, true -> true]

:  bool  ->  bool 

Abbreviation  "exclmid"  defined. 

The  interesting  thing  to  note  here  is  that  this  version  of  not  is  implemented  in  a  completely 
different  fashion  than  the  version  we  wrote  in  CDSO  directly  (cf.  Section  6.4.2).  As  expected,  we 
obtain  the  same  refinement  type. 

Now  we  consider  some  programs  on  integer  lists.  The  map  function  we  first  presented  in  the 
introduction  has  the  refinement  type  we  had  wanted: 

$ val map = letrec mapf = fn f => fn l =>

if null l then [] else (f (hd l)) :: ((mapf f) (tl l))

in mapf
end;

r: /\[(int -> int) -> many_intlist -> many_intlist,

(int -> int) -> one_intlist -> one_intlist,

(int -> int) -> empty_intlist -> empty_intlist]

:  (int  ->  int)  ->  intlist  ->  intlist 
Abbreviation  "map"  defined. 
One of the strengths of our approach to refinement type inference over the Freeman-Pfenning
one can be seen in the following examples:

$ val l3 = 1 :: (2 :: (3 :: []));
r: many_intlist
: intlist

Abbreviation "l3" defined.

$ tl l3;
r: many_intlist
: intlist
request? ;

$ tl (tl l3);
r:  one_intlist 
:  intlist 

Because  we  are  not  using  type  inference  rules,  we  are  able  to  obtain  more  precise  refinement  types. 
Recall that one of the refinement types of tl is many_intlist -> intlist, a type which entails inevitable
loss  of  precision.  By  using  such  a  type  with  inference  rules  it  is  impossible  to  obtain  anything  other 
than  intlist  for  the  refinement  types  of  the  two  expressions  above.  Since  we  query  the  expression 
directly,  we  bypass  this  problem. 
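
The contrast can be made concrete with a small Python sketch. It is a simplification under stated assumptions: lists are fully evaluated Python lists rather than lazily queried states, `TL_TYPE` copies the two refinement branches of tl from the base environment, and the helper names `classify`, `by_rules` and `by_query` are hypothetical.

```python
def classify(xs):
    """Refinement of a concrete list, judged by its backbone only."""
    if len(xs) == 0:
        return "empty_intlist"
    if len(xs) == 1:
        return "one_intlist"
    return "many_intlist"

# The refinement branches of tl as listed in the base environment.
TL_TYPE = {"many_intlist": "intlist", "one_intlist": "empty_intlist"}

def tl(xs):
    return xs[1:]

def by_rules(arg_type, times):
    """Push a refinement through `times` uses of tl using only TL_TYPE.

    Any argument type without a matching branch collapses to intlist
    (a simplification made for this sketch)."""
    t = arg_type
    for _ in range(times):
        t = TL_TYPE.get(t, "intlist")
    return t

def by_query(xs, times):
    """Evaluate the expression itself, then classify what comes out."""
    for _ in range(times):
        xs = tl(xs)
    return classify(xs)

l3 = [1, 2, 3]
print(by_rules(classify(l3), 1), by_query(l3, 1))
print(by_rules(classify(l3), 2), by_query(l3, 2))
```

Working from the type of tl alone, both expressions collapse to intlist, whereas inspecting the expressions themselves recovers many_intlist and one_intlist, as in the transcript.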

Another  advantage  of  our  approach  is  that  we  place  no  restrictions  on  polymorphic  functions. 
When  presenting  the  Freeman-Pfenning  approach  in  Section  2.6,  we  described  how,  in  the  case  of  an 
example  such  as  double  not ,  the  function  double  could  not  be  polymorphic.  Even  then,  obtaining 
the  refinement  type  was  complicated  by  instantiations  of  type  variables  which  led  to  very  long 
refinement  types.  In  our  case,  the  answer  can  be  obtained  very  quickly;  the  questions  and  answers 
session  only  needs  to  know  the  values  of  three  cells  in  order  to  give  a  precise  refinement  type: 

$  val  double  =  fn  f  =>  fn  x  =>  f  (f  x) ; 
r:  (’a  ->  ’a)  ->  ’a  ->  ’a 
:  (’a  ->  ’a)  ->  ’a  ->  ’a 
Abbreviation  "double"  defined. 

$  double  not ; 

r:  /\ [false  ->  false,  true  ->  true] 

:  bool  ->  bool 

When  performing  the  questions  and  answers  session  in  Algorithm  6.6.3,  we  cannot  construct 
new  queries  when  we  need  to  know  the  values  of  cells  that  have  an  infinite  number  of  possible 
values.  Depending  on  the  flow  of  control  we  may  or  may  not  be  able  to  obtain  precise  types  for 
expressions  with  inputs  that  have  such  cells: 

$ val f = fn x => fn y => fn l => (x+y)::l;
r: /\[int -> int -> many_intlist -> many_intlist,
int -> int -> one_intlist -> many_intlist,
int -> int -> empty_intlist -> one_intlist]

: int -> int -> intlist -> intlist
Abbreviation "f" defined.

$ val h = fn x => fn l => if x=3 then l else x::l;
r: int -> intlist -> intlist
: int -> intlist -> intlist
Abbreviation "h" defined.

In  the  first  example  above,  we  obtain  something  very  precise,  and  very  similar  to  the  type  of  cons. 
In  the  second  example  we  are  stuck,  since  we  need  to  know  the  value  of  the  input,  which  comes 
from  int.  We  cannot  construct  a  relevant  approximation,  so  we  return  the  regular  type  as  the 
refinement  type. 

Even when faced with rather complicated-looking expressions, we can usually obtain a refinement
type very quickly. Our system infers a precise type for test2 below with no perceptible wait
time. useless is a function which always returns an empty list.

$ val useless = letrec f = fn l => if null l then [] else f (tl l)
in f
end;

r: /\[many_intlist -> empty_intlist, empty_intlist -> empty_intlist,
one_intlist -> empty_intlist]

:  intlist  ->  intlist 
Abbreviation  "useless"  defined. 

$  val  times2  =  fn  x  =>  x  *  2; 
r:  int  ->  int 
:  int  ->  int 

Abbreviation  "times2"  defined. 

$ val test2 = 1 :: (useless ((map times2) (useless ((map times2) l3))));
r:  one_intlist 
:  intlist 

Abbreviation  "test2"  defined. 

Sometimes  we  are  only  able  to  obtain  partial  information.  Consider  the  following  example  of 
the  familiar  predicate  exists: 

$ val exists = letrec f = fn p => fn l =>
if null l then false
else if p (hd l) then true

else (f p) (tl l)

in f end;

r: /\[(int -> false) -> empty_intlist -> false,

(int -> true) -> empty_intlist -> false,

(int -> bool) -> empty_intlist -> false]

:  (int  ->  bool)  ->  intlist  ->  bool 
Abbreviation  "exists"  defined. 

Again,  the  type  could  have  been  simplified  by  removing  the  last  branch.  We  are  only  able  to  infer 
what  happens  if  the  input  is  empty.  When  querying  the  expression  with  a  non-empty  input,  we 
receive  a  valof  xc  answer,  where  x  is  a  complex  expression  asking,  in  essence,  if  the  property  holds 
of  the  input.  We  cannot  do  anything  with  such  an  answer,  so  we  give  up.  This  brings  us  to  one 
of  the  weaknesses  of  our  approach:  we  are  not  always  able  to  obtain  precise  refinement  types  for 
expressions  with  higher-order  inputs.  Refinement  type  inference  rules,  as  in  the  Freeman-Pfenning 
approach,  would  work  better  in  such  cases.  As  an  example,  consider  the  following  program: 
$ val fll = letrec f = fn b => fn g =>

if g b then true else (f false) g in f end;
r: bool -> (bool -> bool) -> bool
: bool -> (bool -> bool) -> bool
Abbreviation "fll" defined.


By  making  assumptions  about  all  possible  refinement  typings  of  g ,  a  system  with  refinement  type 
inference  rules  might  be  able  to  obtain  more  precise  types. 

We can obtain very detailed types very quickly for many recursive functions, such as
append.


$ val append = letrec f = fn l1 => fn l2 => if null l1 then l2

else (hd l1) :: ((f (tl l1)) l2)


in f end;

r: /\[many_intlist -> intlist -> many_intlist,

many_intlist -> one_intlist -> many_intlist,
many_intlist -> many_intlist -> many_intlist,
many_intlist -> empty_intlist -> many_intlist,
one_intlist -> many_intlist -> many_intlist,
one_intlist -> one_intlist -> many_intlist,
one_intlist -> empty_intlist -> one_intlist,
empty_intlist -> many_intlist -> many_intlist,
empty_intlist -> one_intlist -> one_intlist,
empty_intlist -> empty_intlist -> empty_intlist]
: intlist -> intlist -> intlist
Abbreviation "append" defined.


However, we have encountered recursive functions where the performance of refinement type
inference suffers from the limitations of CDS02.


$ val rev1 = letrec f = fn l => fn result =>
if null l then result
else (f (tl l)) ((hd l) :: result)
in f end;


: intlist -> intlist -> intlist
Abbreviation "rev1" defined.

$ val rev = fn l => (rev1 l) [];

r: /\[many_intlist -> many_intlist, empty_intlist -> empty_intlist,
one_intlist -> one_intlist]

: intlist -> intlist
Abbreviation "rev" defined.

We have omitted the refinement type for rev1 because it is very similar to that of append. Obtaining
the refinement type for rev1 takes 6 seconds, and for rev, 16 seconds. These are the only programs
presented  so  far  on  which  our  interpreter  takes  more  than  a  second.  The  poor  performance  is  due 
to  the  fact  that  the  computation  of  fixpoints  is  not  memoized  in  CDS02. 
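
The cost of recomputing fixpoints can be illustrated generically. The Python sketch below is not CDS02 and says nothing about how CDS02 evaluates fix. It only contrasts a naive fixpoint operator, which re-enters the functional on every recursive call, with one that keeps a table of results. The functional `fib_rec` and the helpers `fix`, `memo_fix` and `counting` are hypothetical names chosen for the example.

```python
def fix(functional):
    """Naive fixpoint: every recursive call re-enters the functional."""
    def f(x):
        return functional(f)(x)
    return f

def memo_fix(functional):
    """Fixpoint that keeps a table of already computed results."""
    table = {}
    def f(x):
        if x not in table:
            table[x] = functional(f)(x)
        return table[x]
    return f

def counting(functional, counter):
    """Wrap a functional so that each evaluation of its body is counted."""
    def wrapped(rec):
        inner = functional(rec)
        def body(x):
            counter[0] += 1
            return inner(x)
        return body
    return wrapped

def fib_rec(rec):
    """A functional whose fixpoint is the Fibonacci function."""
    return lambda n: n if n < 2 else rec(n - 1) + rec(n - 2)

naive_calls = [0]
memo_calls = [0]
naive_value = fix(counting(fib_rec, naive_calls))(15)
memo_value = memo_fix(counting(fib_rec, memo_calls))(15)
print(naive_value, naive_calls[0])
print(memo_value, memo_calls[0])
```

Both operators compute the same value, but the memoized one evaluates the body once per distinct argument, while the naive one repeats work for every recursive call.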

We  end  with  some  looping  programs,  one  of  which  has  a  type  which  does  not  admit  refinements, 
hence  it  is  “sidestepped,”  one  that  is  detected,  and  one  that  reaches  the  depth  bound. 
$ val loop2 = letrec f = fn x => f (x+1) in f end;
r: int -> 'a
: int -> 'a

Abbreviation "loop2" defined.

$ val loop6 = letrec f = fn l => if null l then [] else f l in f end;

This expression loops.
r: intlist -> intlist
: intlist -> intlist
Abbreviation "loop6" defined.

$ val loop8 = letrec f = fn l => if null l then [] else f (1::l) in f end;
This expression may loop. Refinement type inference gives up.
r: intlist -> intlist
: intlist -> intlist
Abbreviation "loop8" defined.
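
The transcript distinguishes a loop that is detected from one that merely exhausts the depth bound. The text does not spell out the detection mechanism at this point, so the following Python sketch rests on an assumption: a loop is reported when a recursive call repeats an argument already seen, and the analysis gives up once a fixed number of calls has been made. The function `run`, the step encoding and the bound of 50 are all hypothetical.

```python
def run(step, start, depth_bound=50):
    """Follow a chain of recursive calls.

    `step(state)` returns ("call", next_state) for a recursive call or
    ("done", value) for a result. States must be hashable."""
    seen = set()
    state = start
    for _ in range(depth_bound):
        if state in seen:
            return "loops"            # exact repetition: certain loop
        seen.add(state)
        kind, payload = step(state)
        if kind == "done":
            return payload
        state = payload
    return "may loop"                 # depth bound reached: give up

def loop6(l):
    # f l = if null l then [] else f l
    return ("done", ()) if not l else ("call", l)

def loop8(l):
    # f l = if null l then [] else f (1::l)
    return ("done", ()) if not l else ("call", (1,) + l)

print(run(loop6, (1,)))
print(run(loop8, (1,)))
print(run(loop6, ()))
```

With this encoding, loop6 repeats its argument and is flagged immediately, while loop8 keeps producing new arguments and can only be cut off by the bound.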
In this chapter we conclude and present possible avenues for further work.

8.1  General  conclusions 

We  believe  we  have  provided  ample  evidence  of  both  the  theoretical  and  practical  utility  of  studying 
intensional  semantics.  Thus,  the  central  claim  of  the  thesis  has  been  demonstrated.  However,  the 
central  claim  was  very  broad,  so  we  present  a  more  detailed  assessment  of  this  work. 

8.1.1  Relative  intensional  expressiveness 

We defined the notion of relative intensional expressiveness between programming languages,
developed a new intensional semantics, circuit semantics, and we set out to prove separation results.
Our goal was to compare languages, and not underlying computation models. We have been able
to compare primitive recursive algorithms with sequential algorithms and parallel algorithms, PCF
extended with por, pifQ, pifL and deterministic query, and, finally, PCF extended with deterministic
and  non-deterministic  query.  However,  in  the  process,  we  have  been  only  partially  successful  in 
staying  true  to  our  original  goal  of  only  comparing  languages.  Of  the  comparisons  we  have  made, 
three  rely  to  some  extent  on  assumptions  about  computation  models: 

1.  When  comparing  CDSO  and  CDSP,  we  allowed  a  construct  to  evaluate  cells  in  parallel  in 
CDSP,  but  CDSO,  due  to  the  inherently  sequential  nature  of  its  operational  semantics,  could 
not  do  something  similar.  Thus,  our  comparison  became  partly  a  comparison  of  a  sequential 
and  a  parallel  machine  model. 

2.  Similarly,  when  comparing  PCF  extended  with  pift  versus  query,  we  only  allowed  parallel 
computations  to  be  started  by  query,  or  the  limited  mechanism  of  pifL.  This  made  the 
comparison  somewhat  artificial,  because  there  are  possible  parallel  evaluation  styles  for  PCF 
as  a  whole. 

3.  The  most  egregious  break  with  our  original  goal  was  made  when  comparing  deterministic 
and non-deterministic query. In order to obtain deterministic results, we focused on a subset
of non-deterministic queries which return deterministic answers under the assumption
of hardware that detects undefined inputs. Deterministic queries cannot take advantage of
this hardware. Using results from circuit complexity, we were then able to prove that
non-deterministic query is more expressive. But we believe the insight gained into connections
between complexity theory and programming languages theory, and in particular, between
DPCF and monotone circuits, offsets the shortcomings of the method. Note that this connection
still applies if we add recursion to DPCF.

The  major  lesson  we  have  learned  from  this  is  that  it  is  interesting  and  worthwhile  to  attempt 
intensional comparisons of programming languages, but that it is difficult to do so in a
completely fair fashion.

8.1.2  Refinement  type  inference 

We  developed  a  novel  type  inference  system  based  on  concrete  data  structures,  which  have  a  more 
elaborate  structure  than  records,  and  we  implemented  it  in  our  CDSO  interpreter.  Gradually,  we 
realized  that  we  could  type  the  various  paths  through  an  algorithm  separately  and  achieve  what 
was  called  in  the  literature  a  refinement  type.  After  becoming  aware  of  the  work  of  Hughes  and 
Ferguson,  we  developed  our  questions  and  answers  approach  to  refinement  type  inference,  which  has 
benefits  and  drawbacks  as  compared  to  the  only  previous  approach,  that  of  Freeman  and  Pfenning. 
The  benefits  are: 

1.  No  restrictions  are  placed  on  the  usage  of  polymorphic  functions  in  order  to  obtain  precise 
refinement  types. 

2.  No  need  to  consider  all  possible  refinements  of  a  type,  which  leads  to  large  time  and  space 
savings,  especially  in  the  case  of  higher-order  types. 

3.  Ability  to  obtain  more  precise  refinement  types  in  many  cases. 

The  drawbacks  of  our  approach  are: 

1.  Poor  performance  in  the  case  of  certain  kinds  of  fixpoint  computations,  due  to  the  underlying 
CDS02  semantics. 

2.  Inability  to  obtain  precise  refinement  types  in  many  cases,  especially  when  higher-order  inputs 
are  involved. 

3.  A  more  restrictive  language  for  defining  refinement  types.  In  particular,  we  cannot  have 
definitions  such  as  the  even  and  odd  refinements  of  boolean  lists  of  Section  2.6. 

There are other differences between the two systems, but they are not as important. For instance,
we  do  not  have  polymorphic  lists.  This  can  be  easily  remedied,  however.  As  we  have  seen,  the 
relevant  cells  in  a  list,  from  the  point  of  view  of  refinement  type  inference,  are  the  backbone  cells, 
EMPTY,  EMPTY.l ,  and  EMPTY.l.l.  Regardless  of  which  kinds  of  lists  we  considered,  those 
cells  would  remain  the  same,  thus  we  can  imagine  extending  CDSO  with  “generic”  list  definitions. 

Despite  the  drawbacks,  we  believe  our  system  shows  signs  of  being  quite  practical.  We  have 
already  benefited  from  it  in  finding  a  programming  error.  There  are  two  obstacles  that  we  see 
before  the  system  becomes  truly  practical:  the  type  definitions  must  currently  be  done  in  CDSO, 
and  there  exist  performance  concerns  for  certain  recursive  functions.  We  believe  these  problems 
can  be  solved,  and  we  shall  discuss  this  issue  in  the  next  section. 

In  the  process  of  developing  the  refinement  type  inference  system,  we  established  a  new  way 
of  using  sequential  algorithms  to  perform  abstract  interpretation.  The  previous  approach,  that  of 
Hughes  and  Ferguson,  suffered  from  severe  space  problems  [51].  Our  approach  has  speed  problems 
in  the  case  of  fixpoints.  The  natural  question  is  how  to  combine  the  best  of  the  two  approaches. 
This  possibility  is  also  discussed  in  the  next  section. 
8.2 Further work

We  discuss  possible  areas  of  further  work  in  three  main  categories:  refinement  type  inference,  CDSO 
applications,  and  extensions  of  CDSO. 

8.2.1  Refinement  type  inference 

The obvious idea suggested by the comments in the previous section is to mix the two approaches,
combining refinement type inference rules with abstract interpretation of the expression. Consider
the following program:

rev  (tl  ones) ; 

where ones is the infinite list of 1's defined earlier. Our system can infer a precise type for a piece
of  the  program: 

$  tl  ones; 
r:  many_intlist 
:  intlist 
request?  ; 

The Freeman-Pfenning system can only infer intlist as the type of this expression. However, given
something known to have refinement type many_intlist, that system could obtain type many_intlist
for the result of applying rev to it. Our system cannot infer many_intlist as the final answer,
because the computation loops, so refinement type inference gives up. A combined system would
be able to obtain the type many_intlist for the whole expression. Of course, it is not clear how to
achieve this combination of the two approaches in detail, but it seems like a particularly interesting
area for future work.

There  are  several  ways  in  which  our  implementation  can  be  improved.  Aside  from  removing 
the  discrepancies  between  the  implementation  and  the  theory,  there  are  two  main  ways  we  could 
strive  for  better  performance: 

1.  Simplification  of  categorical  code.  Currently,  we  perform  no  simplifications  at  all  on  the 
combinator  code  which  results  from  the  compilation  of  PCF  programs.  This  code  is  very 
inefficient.  One  of  the  major  implementations  of  ML,  CAML  [61,  25],  is  based  on  the  same 
compilation  to  categorical  combinators,  and  it  relies  on  many  optimizations.  Adopting  even 
a small subset of these for our purposes would probably result in markedly improved
performance.

2.  Memoization  of  fixpoint  computations  in  CDS02.  We  believe  this  is  the  main  performance 
bottleneck.  Developing  a  mixed  CDS02/01  evaluation  strategy  that  keeps  tables  around  for 
fixpoints  should  solve  most  of  our  performance  problems. 

As  far  as  having  to  define  types  in  CDSO  is  concerned,  we  believe  this  is  not  an  enormous 
problem.  Having  to  write  programs  in  CDSO  is  more  of  a  concern,  but,  as  we  have  shown,  that 
can be avoided. It is possible to make the CDSO type definitions more like ML-style definitions.
The  original  paper  on  CDSO  [5]  takes  some  steps  in  this  direction,  and  one  can  probably  go  much 
further. 
8.2.2 Applications of CDSO

We  have  already  mentioned  that  we  believe  CDS02  confers  significant  advantages  for  the  purpose 
of  abstract  interpretation  over  CDS01.  It  seems  Hughes  and  Ferguson  have  considered  the  space 
problems  of  CDS01  insurmountable  [51].  We  plan  on  developing  strictness  analysis  based  on  CDS02, 
and  hope  to  see  the  same  great  performance  mentioned  in  [33]  without  the  massive  storage  use. 

Another  area  we  plan  to  investigate  is  the  use  of  CDS02  profiling  semantics  for  the  purpose  of 
complexity  analysis  of  lazy,  higher-order  programs.  We  have  already  made  a  start  in  Chapter  3,  by 
providing  operational  semantics  rules  extended  with  step  information.  It  would  be  quite  interesting 
to  analyze  some  of  the  problems  in  Wadler  [85]  and  Sands  [78]  with  our  approach.  One  of  the  great 
strengths  of  sequential  algorithms  is  that  they  provide  a  uniform  way  of  moving  from  first-order 
to  higher-order,  and  so  a  lot  of  the  problems  encountered  in  the  previously  cited  approaches  might 
disappear. 

Finally,  CDSO  is  an  implementation  of  a  game  semantics,  and  much  has  been  written  about 
connections  between  game  semantics  and  parallel  implementations  of  functional  languages  (see  [1], 
for  instance).  The  idea  is  that  a  sequential  computation  can  be  broken  down  into  a  network 
of  concurrent  processes  which  exchange  information.  We  are  of  the  opinion  that  our  refinement 
type  inference  framework  can  be  extended  for  the  purpose  of  analyzing  such  networks  of  processes 
communicating  through  channels,  and  we  have  already  started  work  in  this  direction. 

8.2.3  Extensions  of  CDSO 

We  briefly  mention  two  ideas  on  extensions  of  CDSO,  which  are  somewhat  distantly  related  to  our 
current  work.  First,  we  are  considering  the  possibility  of  allowing  non-ground  dcds  definitions,  and 
extending  the  language  with  channels.  The  idea  would  be  to  be  able  to  send  higher-order  messages 
along  a  channel  in  a  piece-meal  fashion.  It  is  not  clear  yet  how  this  extension  would  affect  the  type 
system  we  developed  for  CDSO. 

Second,  we  are  envisioning  a  parallel  extension  of  CDSO  for  artificial  intelligence  applications. 
One of the most interesting features of CDSO is the ability to write semantics-manipulation
algorithms, such as AND_TASTER. Imagine a language for programming virtual worlds in which
agents  can  meet  and  interact  based  on  each  other’s  semantics.  We  could  use  refinement  types  in 
such  a  system  to  obtain  very  interesting  behavioral  information  on  agents. 


Appendix  A 

Summary  of  Major  Definitions 


A.1 CDSO operational semantics


A.1.1 Evaluation of forests


(Tree1)


Tree  (c^,  ins i), . . . ,  Tree  (dn,  insn)  ?  x\  •  ■  •  xnd 


insi  ?  x\  ■  •  •  xnd 


(Tree2) 

(Result) 


(Valof) 


(From) 


_ C'L  £  c' _ 

Tree  (c'x,  ins x), . . . ,  Tree  ( dn ,  insn )  ?  x\  •  •  ■  xnd  ->  Cl 
Result  v'  ?  X\  •  •  •  — >  u' 


xp  ?  c  -> 


’  Vi 

<  v,  and  Vi.  Vi  ^  v 

ci 

\ _ 


Valof  ( c,p )  is 

V\  :  insi 


>  ?  x\  ■  •  •  a;nc'  -»• 


end 


insi  ?  xi  •  •  •  a;nc' 

<  f l 

output valof  c 


{Vi 

v,  and  Vi.  Vi  ^  v 

ci 


From  ( c,p )  is 

Vi  :  insj 


>  ?  rci  •  •  •  xnd 


end 


(  insi  ?  x\  •  •  •  xnd 
<  fail  with  no-access 


[  Cl 


A.1.2 CDS02 rules


Rules (App), (Comp), (Fix), (Pair), (Prod), (Curry), and (Uncurry).
[The two-dimensional layout of these rules did not survive text extraction; only the two rules below remain legible.]

\[ \text{(Pair)} \quad \langle A_1, \ldots, A_n \rangle \;?\; x\,(c.i) \;\to\; A_i \;?\; x\,c \]

\[ \text{(Prod)} \quad \prod_{i=1}^{n} A_i \;?\; (c.i) \;\to\; A_i \;?\; c \]

A.2  CDSO typing rules

A.2.1  Subtyping  and  intersection  types 

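\[ \text{(Sub-Refl)} \quad \sigma \le \sigma \]

\[ \text{(Sub-Trans)} \quad \frac{\sigma \le \tau \qquad \tau \le \delta}{\sigma \le \delta} \]

\[ \text{(Sub-Prod)} \quad \frac{\sigma_1 \le \tau_1 \qquad \sigma_2 \le \tau_2}{\sigma_1 \times \sigma_2 \le \tau_1 \times \tau_2} \]

\[ \text{(And-Intro)} \quad \frac{x : \sigma_1 \quad \cdots \quad x : \sigma_n}{x : \bigwedge[\sigma_1 \ldots \sigma_n]} \]

\[ \text{(And-Elim)} \quad \frac{x : \bigwedge[\sigma_1 \ldots \sigma_n]}{x : \sigma_i} \]

\[ \text{(Sub-And-R)} \quad \frac{\forall i.\ \sigma \le \tau_i}{\sigma \le \bigwedge[\tau_1 \ldots \tau_n]} \]

\[ \text{(Sub-And-L)} \quad \bigwedge[\sigma_1 \ldots \sigma_n] \le \sigma_i \]

\[ \text{(Sub-Arrow)} \quad \frac{\sigma_2 \le \sigma_1 \qquad \tau_1 \le \tau_2}{\sigma_1 \to \tau_1 \le \sigma_2 \to \tau_2} \]

\[ \text{(Sub-And-Dist)} \quad \bigwedge[\sigma \to \tau_1 \ldots \sigma \to \tau_n] \le \sigma \to \bigwedge[\tau_1 \ldots \tau_n] \]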

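\[ \text{(Sub-Over)} \quad \frac{\forall j.\ \exists i.\ \sigma_i \to \tau_i \le \sigma'_j \to \tau'_j}{\{\sigma_i \to \tau_i \mid i \in 1..n\} \le \{\sigma'_j \to \tau'_j \mid j \in 1..m\}} \]

\[ \text{(Sub-Meet-Over)} \quad \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \le \{\sigma_i \to \tau_i \mid i \in 1..n\} \]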
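The subtyping rules of this section can be read as a recursive check on the structure of two types. The following Python fragment is a minimal sketch of such a check, not the implementation used in the thesis. The class names, the `subtype` function, and the base ordering `nat <= int` are assumptions made for illustration. The sketch covers (Sub-Refl), (Sub-Prod), (Sub-Arrow), (Sub-And-L), and (Sub-And-R) only; it omits (Sub-And-Dist) and the overloaded types, so it is sound for these rules but not complete.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Prod:
    left: "Type"
    right: "Type"


@dataclass(frozen=True)
class Arrow:
    dom: "Type"
    cod: "Type"


@dataclass(frozen=True)
class And:
    parts: Tuple["Type", ...]


Type = Union[Base, Prod, Arrow, And]

# Hypothetical ordering on base types, assumed only for this example.
BASE_ORDER = {("nat", "int")}


def subtype(s: Type, t: Type) -> bool:
    """Return True if s <= t is derivable from the rules covered here."""
    if s == t:                                   # (Sub-Refl)
        return True
    if isinstance(t, And):                       # (Sub-And-R)
        return all(subtype(s, p) for p in t.parts)
    if isinstance(s, And):                       # (Sub-And-L) with (Sub-Trans)
        return any(subtype(p, t) for p in s.parts)
    if isinstance(s, Base) and isinstance(t, Base):
        return (s.name, t.name) in BASE_ORDER
    if isinstance(s, Prod) and isinstance(t, Prod):   # (Sub-Prod)
        return subtype(s.left, t.left) and subtype(s.right, t.right)
    if isinstance(s, Arrow) and isinstance(t, Arrow):  # (Sub-Arrow)
        # Contravariant in the domain, covariant in the codomain.
        return subtype(t.dom, s.dom) and subtype(s.cod, t.cod)
    return False


INT, NAT = Base("int"), Base("nat")
print(subtype(Arrow(INT, NAT), Arrow(NAT, INT)))
print(subtype(Arrow(NAT, INT), Arrow(INT, NAT)))
```

The two calls at the end exercise the contravariance of (Sub-Arrow): a function accepting every `int` and returning a `nat` can stand in for one that accepts `nat` and returns `int`, but not the other way round.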

A.2.2  Monomorphic type inference


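\[ \text{(App)} \quad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \qquad b : \bigwedge[\delta_1 \ldots \delta_m]}{a.b : \bigwedge[\tau_i \mid \exists j.\ \delta_j \le \sigma_i]} \]

\[ \text{(Comp)} \quad \frac{a : \bigwedge[\tau_i \to \delta_i \mid i \in 1..n] \qquad b : \bigwedge[\sigma_j \to \tau'_j \mid j \in 1..m]}{a|b : \bigwedge[\sigma_j \to \delta_i \mid \tau'_j \le \tau_i]} \]

\[ \text{(Fix)} \quad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n]}{\mathit{fix}(a) : \bigwedge[\sigma_i \mid \sigma_i \ge \tau_i]} \]

\[ \text{(Curry)} \quad \frac{a : \bigwedge[(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n]}{\mathit{curry}(a) : \bigwedge[\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n]} \]

\[ \text{(Uncurry)} \quad \frac{a : \bigwedge[\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n]}{\mathit{uncurry}(a) : \bigwedge[(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n]} \]

\[ \text{(Pair)} \quad \frac{a : \bigwedge[\sigma_i \to \tau_i \mid i \in 1..n] \qquad b : \bigwedge[\sigma'_j \to \zeta_j \mid j \in 1..m]}{\langle a, b \rangle : \bigwedge[\sigma'_j \to (\tau_i \times \zeta_j) \mid \sigma'_j \le \sigma_i]} \]

\[ \text{(Prod)} \quad \frac{a : \bigwedge[\sigma_1 \ldots \sigma_n] \qquad b : \bigwedge[\tau_1 \ldots \tau_m]}{(a, b) : \bigwedge[\sigma_i \times \tau_j \mid i \in 1..n,\ j \in 1..m]} \]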
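The (App) rule selects, from the intersection type of the function, the result types of those arrows whose domain lies above some type of the argument. The Python fragment below is a minimal sketch of that selection under stated assumptions: types are plain strings, product types are tuples, and the ordering places the refinements `true` and `false` below `bool` (treating refinement as part of the ordering is an assumption of this sketch). The arrow list `LAND` is copied from the refinement type reported for `land` in Appendix B.4; the helper names `leq` and `app_type` are hypothetical.

```python
# Assumed ordering: the refinements true and false lie below bool.
BELOW = {("true", "bool"), ("false", "bool")}


def leq(s, t):
    """s <= t, reflexive on names and componentwise on products (tuples)."""
    if s == t:
        return True
    if isinstance(s, tuple) and isinstance(t, tuple):
        return len(s) == len(t) and all(leq(x, y) for x, y in zip(s, t))
    return (s, t) in BELOW


def app_type(fun_arrows, arg_types):
    """Result intersection for a.b: every tau_i whose domain sigma_i
    lies above some argument type delta_j."""
    result = []
    for dom, cod in fun_arrows:
        if any(leq(d, dom) for d in arg_types) and cod not in result:
            result.append(cod)
    return result


# Refinement type of land from Appendix B.4, as (domain, codomain) pairs.
LAND = [
    (("false", "bool"), "false"),
    (("true", "true"), "true"),
    (("true", "false"), "false"),
]

print(app_type(LAND, [("false", "true")]))
print(app_type(LAND, [("true", "true")]))
print(app_type(LAND, [("bool", "bool")]))
```

With an argument known only to have type `bool * bool`, no arrow of the refinement type applies and the result intersection is empty, so only the unrefined type `bool` remains available for the application.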


A.2.3  Polymorphic  type  inference 

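\[ \text{(Gen)} \quad \frac{e : \sigma}{e : \forall\alpha.\ \sigma} \]

\[ \text{(Inst)} \quad \frac{e : \forall\alpha.\ \sigma}{e : [\tau/\alpha]\sigma} \]

\[ \text{(App-Over)} \quad \frac{a : \{\sigma_i \to \tau_i \mid i \in 1..n\} \qquad b : \sigma}{a.b : \bigvee[\tau_i \mid \sigma \le \sigma_i]} \]

\[ \text{(Comp-Over)} \quad \frac{a : \{\tau_i \to \delta_i \mid i \in 1..n\} \qquad b : \sigma \to \tau'}{a|b : \sigma \to \bigvee[\delta_i \mid \tau' \le \tau_i]} \]

\[ \text{(Fix-Over)} \quad \frac{a : \{\sigma_i \to \tau_i \mid i \in 1..n\}}{\mathit{fix}(a) : \bigvee[\sigma_i \mid \sigma_i \ge \tau_i]} \]

\[ \text{(Curry-Over)} \quad \frac{a : \{(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n\}}{\mathit{curry}(a) : \{\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n\}} \]

\[ \text{(Uncurry-Over)} \quad \frac{a : \{\sigma_i \to \sigma'_i \to \tau_i \mid i \in 1..n\}}{\mathit{uncurry}(a) : \{(\sigma_i \times \sigma'_i) \to \tau_i \mid i \in 1..n\}} \]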


A.2.4  Refinement types

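\[ \text{(Ref-Refl)} \quad \tau \sqsubseteq \tau \]

\[ \text{(Ref-Sub)} \quad \frac{\sigma \le \tau}{\sigma \sqsubseteq \tau} \]

\[ \text{(Ref-And)} \quad \frac{\forall i.\ \exists j.\ \sigma_j \sqsubseteq \tau_i}{\bigwedge[\sigma_1 \ldots \sigma_n] \sqsubseteq \bigwedge[\tau_1 \ldots \tau_m]} \]

\[ \text{(Ref-Arrow)} \quad \frac{\sigma_1 \sqsubseteq \sigma_2 \qquad \tau_1 \sqsubseteq \tau_2}{\sigma_1 \to \tau_1 \sqsubseteq \sigma_2 \to \tau_2} \]

\[ \text{(Ref-Prod)} \quad \frac{\sigma_1 \sqsubseteq \sigma_2 \qquad \tau_1 \sqsubseteq \tau_2}{\sigma_1 \times \tau_1 \sqsubseteq \sigma_2 \times \tau_2} \]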
Appendix B


CDSO and CDSP Algorithms


B.1  left_min

let left_min_rec = algo
  request {}B do
    output valof (B.1)
  end

  request {(B.1)=0}B do
    output output 0
  end

  request {(B.1)=1}B do
    output valof (B.2)
  end

  request {(B.1)=1,(B.2)=0}B do
    output output 0
  end

  request {(B.1)=1,(B.2)=1}B do
    output output 1
  end

  request {}((B.$V).s) do
    valof {}(B.$V) is
      valof ((B.$V).1) : output valof (((B.$V).s).1)
    end
  end

  request {(((B.$V).s).1)=0}((B.$V).s) do
    from {{}(B.$V)=valof ((B.$V).1)} do
      valof {((B.$V).1)=0}(B.$V) is
        output 0 : output output 0
      end
    end
  end

  request {(((B.$V).s).1)=1}((B.$V).s) do
    from {{}(B.$V)=valof ((B.$V).1)} do
      valof {((B.$V).1)=1}(B.$V) is
        valof ((B.$V).2) : output valof (((B.$V).s).2)
      end
    end
  end

  request {(((B.$V).s).1)=1,(((B.$V).s).2)=0}((B.$V).s) do
    from {{}(B.$V)=valof ((B.$V).1),{((B.$V).1)=1}(B.$V)=valof ((B.$V).2)} do
      valof {((B.$V).1)=1,((B.$V).2)=0}(B.$V) is
        output 0 : output output 0
      end
    end
  end

  request {(((B.$V).s).1)=1,(((B.$V).s).2)=1}((B.$V).s) do
    from {{}(B.$V)=valof ((B.$V).1),{((B.$V).1)=1}(B.$V)=valof ((B.$V).2)} do
      valof {((B.$V).1)=1,((B.$V).2)=1}(B.$V) is
        output 1 : output output 1
      end
    end
  end
end;

let left_min = fix(left_min_rec);
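The first five `request` clauses of `left_min_rec` describe a left-to-right minimum on bits: cell `B.1` is inspected first, and `B.2` is inspected only when `B.1` holds 1. The Python fragment below is a minimal sketch of that first-order behaviour only; it does not model the higher-order clauses over `(B.$V).s` or the use of `fix`. Inputs are modelled as zero-argument callables so that the order of inspection can be logged; the helper `traced` and the log format are assumptions made for illustration.

```python
def left_min(b1, b2):
    """Left-to-right min on bits; b1 and b2 are thunks returning 0 or 1."""
    if b1() == 0:
        return 0          # decided without inspecting the second input
    return b2()


def traced(name, value, log):
    """Wrap a value in a thunk that records each inspection in log."""
    def thunk():
        log.append(name)
        return value
    return thunk


log = []
result = left_min(traced("B.1", 0, log), traced("B.2", 1, log))
print(result, log)
```

When the first input is 0 the log contains only `B.1`, which is the intensional property that distinguishes this algorithm from other algorithms computing the same function.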


B.2  min 

let min_rec = algo
  request {}B do
    output query {(B.1), (B.2)} is
      {0, _} => output 0
      {_, 0} => output 0
      {1, 1} => output 1
    end
  end

  request {}((B.$V).s) do
    valof {}(B.$V) is
      query {((B.$V).1), ((B.$V).2)} is
        {0, _} => output 0
        {_, 0} => output 0
        {1, 1} => output 1
      end : output query {(((B.$V).s).1), (((B.$V).s).2)} is
        {0, _} => output 0
        {_, 0} => output 0
        {1, 1} => output 1
      end
    end
  end
end;

let min = fix(min_rec);
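The `query` construct in `min_rec` matches both cells against patterns with wildcards, so a 0 in either position appears to suffice for the answer 0. The Python fragment below is a minimal sketch of the resulting input/output behaviour on partial inputs, under the assumption that an input cell which is not yet filled is modelled as `None` and that "no output yet" is also modelled as `None`. The name `par_min` is hypothetical, and the sketch says nothing about how the query is scheduled.

```python
def par_min(b1, b2):
    """Min on bits over partial inputs; None stands for an unfilled cell."""
    if b1 == 0 or b2 == 0:      # patterns {0, _} and {_, 0}
        return 0
    if b1 == 1 and b2 == 1:     # pattern {1, 1}
        return 1
    return None                 # not enough information for any pattern


print(par_min(0, None))
print(par_min(None, 0))
print(par_min(1, None))
```

The second call is the interesting one: the answer 0 is produced although the first cell is unfilled, which a strictly left-to-right algorithm could not do.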

B.3  CDSO  algorithms  used  to  compile  PCF 

We  list  the  base  environment  of  CDSO  dcds  declarations  and  algorithms  which  are  used  to  compile 
PCF  programs.  There  are  two  versions  of  this  base  environment:  one  defines  refinements  of  bool 
and  intlist,  and  the  other  does  not.  We  show  the  code  for  the  refined  version. 

(* Constants that are part of the PCF environment *)
(* Automatically loaded in when user switches to *)
(* PCF interpreter. *)

(* The basic types *)

let bool = dcds cell B values tt,ff end;
let int = dcds cell N values [..] end;

(* The refinement types *)

let true = dcds cell B values tt end;
let false = dcds cell B values ff end;
refine true, false;

(* The primitive operations *)

(* cond : ((bool * 'a) * 'a) -> 'a *)
let cond =
  algo
    request $C do
      valof ((B.1).1) is
        tt: valof (($C.2).1) is
              $V: output $V
            end
        ff: valof ($C.2) is
              $W: output $W
            end
      end
    end
  end;

(* fst : ('a * 'b) -> 'a *)
let fst = algo
  request $C do
    valof ($C.1) is
      $V: output $V
    end
  end
end;

(* snd : ('a * 'b) -> 'b *)
let snd = algo
  request $C do
    valof ($C.2) is
      $W: output $W
    end
  end
end;

(* plus : (int * int) -> int *)
let plus = algo
  request N do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2: output $V1 + $V2
           end
    end
  end
end;

let minus = algo
  request N do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2: output $V1 - $V2
           end
    end
  end
end;

let times = algo
  request N do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2: output $V1 * $V2
           end
    end
  end
end;

let div = algo
  request N do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2: output $V1 / $V2
           end
    end
  end
end;

(* equal : (int * int) -> bool *)
let equal = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2 with $V2 = $V1: output tt
             $V2 with $V2 != $V1: output ff
           end
    end
  end
end;

let less = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2 with $V2 > $V1: output tt
             $V2 with $V2 <= $V1: output ff
           end
    end
  end
end;

let grtr = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2 with $V2 < $V1: output tt
             $V2 with $V2 >= $V1: output ff
           end
    end
  end
end;

let leq = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2 with $V2 >= $V1: output tt
             $V2 with $V2 < $V1: output ff
           end
    end
  end
end;

let geq = algo
  request B do
    valof (N.1) is
      $V1: valof (N.2) is
             $V2 with $V2 <= $V1: output tt
             $V2 with $V2 > $V1: output ff
           end
    end
  end
end;

(* and : (bool * bool) -> bool *)
let land =
  algo
    request B do
      valof (B.1) is
        tt: valof (B.2) is
              tt: output tt
              ff: output ff
            end
        ff: output ff
      end
    end
  end;

let lor =
  algo
    request B do
      valof (B.1) is
        tt: output tt
        ff: valof (B.2) is
              tt: output tt
              ff: output ff
            end
      end
    end
  end;

(* Now things that are not explicitly in the language but are needed *)
(* in the translation to categorical combinators. *)

(* id : 'a -> 'a *)
let id = algo
  request $C do
    valof $C is
      $V : output $V
    end
  end
end;

(* The empty environment *)
let emptyenv = {};

(* the "regular" fixpoint, Y : ('a -> 'a) -> 'a *)
(* Y = fix (fn f => fn x => x (f x)) *)
let Y = fix((curry(curry(uncurry(id) |
            <snd, uncurry(id) | <snd|fst, snd>>))).emptyenv);

(* the "environment" fixpoint, Yenv : (env -> 'a -> 'a) -> env -> 'a *)
let Yenv = curry(Y | uncurry(id));

(* Integer lists *)

letrec intlist = dcds
  cell EMPTY values true, false
  graft (int.1) access EMPTY = false
  graft (intlist.1) access EMPTY = false
end;

(* refined types *)

let empty_intlist = dcds
  cell EMPTY values true
end;

let one_intlist = dcds
  cell EMPTY values false
  cell (N.1) values [..] access EMPTY = false




  cell (EMPTY.1) values true access EMPTY = false
end;

local letrec partial_intlist = dcds
  cell (EMPTY.1) values true, false access EMPTY = false
  cell (N.1) values [..] access EMPTY = false
  graft (partial_intlist.1) access EMPTY = false
end
in let many_intlist = dcds
  cell EMPTY values false
  cell (N.1) values [..] access EMPTY = false
  cell (EMPTY.1) values false access EMPTY = false
  cell ((N.1).1) values [..] access (EMPTY.1) = false
  graft (partial_intlist.1)
end
end;

refine empty_intlist, one_intlist, many_intlist;

let nil = {EMPTY = true};

let null = algo
  request B do
    valof EMPTY is
      true : output tt
      false : output ff
    end
  end
end;

let cons = algo
  request EMPTY do
    output false
  end

  request (N.1) do
    valof (N.1) is
      $V : output $V
    end
  end

  request (EMPTY.1) do
    valof (EMPTY.2) is
      $B : output $B
    end
  end

  request (((EMPTY.$T).1).1) do
    from {((EMPTY.$T).2)=false} do
      valof (((EMPTY.$T).1).2) is
        $B : output $B
      end
    end
  end

  request (((N.$T).1).1) do
    from {((EMPTY.$T).2)=false} do
      valof (((N.$T).1).2) is
        $V : output $V
      end
    end
  end
end;

let hd = algo
  request N do
    valof EMPTY is
      false : valof (N.1) is
                $V : output $V
              end
    end
  end
end;

let tl = algo
  request (EMPTY.$T) do
    from {(EMPTY.$T)=false} do
      valof ((EMPTY.$T).1) is
        $B : output $B
      end
    end
  end

  request ((N.$T).1) do
    from {(EMPTY.$T)=false, ((EMPTY.$T).1)=false} do
      valof (((N.$T).1).1) is
        $V : output $V
      end
    end
  end
end;
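The comment on `Y` in the listing above gives its intended reading as `fix (fn f => fn x => x (f x))`. The Python fragment below is a minimal sketch of that reading, not a translation of the categorical-combinator definition. Because Python evaluates arguments eagerly, the recursive references are eta-expanded into lambdas so that unfolding happens only on demand; that adjustment, and the factorial example, are assumptions of this sketch.

```python
def fix(F):
    """Fixpoint of F, unfolded lazily: fix(F) = F(fix(F))."""
    return F(lambda *args: fix(F)(*args))


# Y = fix (fn f => fn x => x (f x)), with (f x) eta-expanded for eager Python.
Y = fix(lambda f: lambda x: x(lambda *args: f(x)(*args)))

# Usage: tie the recursive knot of a non-recursive functional.
fact = Y(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))
```

The functional passed to `Y` never refers to itself; recursion arises only from the fixpoint, which mirrors how `letrec` in PCF is compiled through `Y` and `Yenv`.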


B.4  Types  for  CDSO  algorithms  in  base  environment 

- cdsO(refined);
-- Loading PCF constants.
Type bool defined.
Type int defined.
Type true defined.
Type false defined.
r: /\[((false * 'a) * 'b) -> 'b, ((true * 'c) * 'd) -> 'c]
 : ((bool * 'a) * 'a) -> 'a
Abbreviation "cond" defined.
r: ('a * 'b) -> 'a
 : ('a * 'b) -> 'a
Abbreviation "fst" defined.
r: ('a * 'b) -> 'b
 : ('a * 'b) -> 'b
Abbreviation "snd" defined.
r: (int * int) -> int
 : (int * int) -> int
Abbreviation "plus" defined.
r: (int * int) -> int
 : (int * int) -> int
Abbreviation "minus" defined.
r: (int * int) -> int
 : (int * int) -> int
Abbreviation "times" defined.
r: (int * int) -> int
 : (int * int) -> int
Abbreviation "div" defined.
r: (int * int) -> bool
 : (int * int) -> bool
Abbreviation "equal" defined.
r: (int * int) -> bool
 : (int * int) -> bool
Abbreviation "less" defined.
r: (int * int) -> bool
 : (int * int) -> bool
Abbreviation "grtr" defined.
r: (int * int) -> bool
 : (int * int) -> bool
Abbreviation "leq" defined.
r: (int * int) -> bool
 : (int * int) -> bool
Abbreviation "geq" defined.
r: /\[(false * bool) -> false, (true * true) -> true, (true * false) -> false]
 : (bool * bool) -> bool
Abbreviation "land" defined.
r: /\[(false * false) -> false, (true * bool) -> true, (false * true) -> true]
 : (bool * bool) -> bool
Abbreviation "lor" defined.
r: 'a -> 'a
 : 'a -> 'a
Abbreviation "id" defined.
r: 'a
 : 'a
Abbreviation "emptyenv" defined.
r: ('a -> 'a) -> 'a
 : ('a -> 'a) -> 'a
Abbreviation "Y" defined.
r: ('a -> 'b -> 'b) -> 'a -> 'b
 : ('a -> 'b -> 'b) -> 'a -> 'b
Abbreviation "Yenv" defined.
Type intlist defined.
Type empty_intlist defined.
Type one_intlist defined.
Type many_intlist defined.
r: empty_intlist
 : intlist
Abbreviation "nil" defined.
r: /\[one_intlist -> false, empty_intlist -> true, many_intlist -> false]
 : intlist -> bool
Abbreviation "null" defined.
r: /\[(int * one_intlist) -> many_intlist, (int * many_intlist) -> many_intlist,
      (int * empty_intlist) -> one_intlist]
 : (int * intlist) -> intlist
Abbreviation "cons" defined.
r: /\[one_intlist -> int, many_intlist -> int]
 : intlist -> int
Abbreviation "hd" defined.
r: /\[many_intlist -> intlist, one_intlist -> empty_intlist]
 : intlist -> intlist
Abbreviation "tl" defined.

CDSO  version  1.1  -  June  11,  1997 

# 
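The refinement types reported for `cons` and `tl` in the transcript above can be checked against a concrete model of lists. The Python fragment below is a minimal sketch under the assumption that `empty_intlist`, `one_intlist`, and `many_intlist` denote lists of length zero, one, and at least two, which is how the dcds declarations in B.3 read. The function `refinement` and the two lookup tables are hypothetical names; the table entries are copied from the transcript.

```python
def refinement(xs):
    """Classify a Python list by the refinement of intlist it inhabits."""
    if len(xs) == 0:
        return "empty_intlist"
    if len(xs) == 1:
        return "one_intlist"
    return "many_intlist"


# Result refinement of cons, indexed by the refinement of its list argument.
CONS_RESULT = {
    "empty_intlist": "one_intlist",
    "one_intlist": "many_intlist",
    "many_intlist": "many_intlist",
}

# Result type of tl; "intlist" means that no finer refinement is reported.
TL_RESULT = {
    "one_intlist": "empty_intlist",
    "many_intlist": "intlist",
}

for xs in ([], [1], [1, 2], [1, 2, 3]):
    assert refinement([0] + xs) == CONS_RESULT[refinement(xs)]

print(refinement([7]), "->", TL_RESULT[refinement([7])])
```

The entry for `tl` on `many_intlist` is the unrefined `intlist` because the tail of a list with at least two elements may have one element or many, so no single refinement describes it.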


B.5  AND_TASTER

# let AND_TASTER =
algo
  request WHICH_AND do
    valof {}B is
      output tt: output IS_NOT_AN_AND
      output ff: output IS_NOT_AN_AND
      valof (B.1):
        valof {(B.1)=tt}B is
          output tt: output IS_NOT_AN_AND
          output ff: output IS_NOT_AN_AND
          valof (B.2):
            valof {(B.1)=tt,(B.2)=tt}B is
              output ff: output IS_NOT_AN_AND
              output tt:
                valof {(B.1)=tt,(B.2)=ff}B is
                  output tt: output IS_NOT_AN_AND
                  output ff:
                    valof {(B.1)=ff}B is
                      output tt: output IS_NOT_AN_AND
                      output ff: output IS_LEFT_AND
                      valof (B.2):
                        valof {(B.1)=ff,(B.2)=tt}B is
                          output tt: output IS_NOT_AN_AND
                          output ff:
                            valof {(B.1)=ff,(B.2)=ff}B is
                              output tt: output IS_NOT_AN_AND
                              output ff: output IS_LEFT_STRICT_AND
                            end
                        end
                    end
                end
            end
        end
      valof (B.2):
        valof {(B.2)=tt}B is
          output tt: output IS_NOT_AN_AND
          output ff: output IS_NOT_AN_AND
          valof (B.1):
            valof {(B.2)=tt,(B.1)=tt}B is
              output ff: output IS_NOT_AN_AND
              output tt:
                valof {(B.2)=tt,(B.1)=ff}B is
                  output tt: output IS_NOT_AN_AND
                  output ff:
                    valof {(B.2)=ff}B is
                      output tt: output IS_NOT_AN_AND
                      output ff: output IS_RIGHT_AND
                      valof (B.1):
                        valof {(B.2)=ff,(B.1)=tt}B is
                          output tt: output IS_NOT_AN_AND
                          output ff:
                            valof {(B.2)=ff,(B.1)=ff}B is
                              output tt: output IS_NOT_AN_AND
                              output ff: output IS_RIGHT_STRICT_AND
                            end
                        end
                    end
                end
            end
        end
    end
  end
end;

r: /\[/\[(false * false) -> false, (true * true) -> true,
         (true * false) -> false, (false * true) -> false] -> is_and_type,
      ((bool * bool) -> true) -> is_not_and_type,
      /\[(false * false) -> true, (true * true) -> true,
         (true * false) -> false, (false * true) -> false] -> is_not_and_type,
      ((bool * bool) -> false) -> is_not_and_type,
      /\[(true * false) -> true, (true * true) -> true,
         (false * true) -> false] -> is_not_and_type,
      ((true * bool) -> true) -> is_not_and_type,
      /\[(bool * false) -> false, (true * true) -> true,
         (false * true) -> false] -> is_and_type,
      ((true * bool) -> false) -> is_not_and_type,
      /\[(bool * false) -> true, (true * true) -> true,
         (false * true) -> false] -> is_not_and_type,
      ((true * true) -> false) -> is_not_and_type,
      /\[(false * true) -> true, (true * true) -> true] -> is_not_and_type,
      /\[(true * false) -> true, (true * true) -> true] -> is_not_and_type,
      ((bool * true) -> false) -> is_not_and_type,
      /\[(false * bool) -> true, (true * true) -> true,
         (true * false) -> false] -> is_not_and_type,
      ((bool * true) -> true) -> is_not_and_type,
      /\[(false * bool) -> false, (true * true) -> true,
         (true * false) -> false] -> is_and_type,
      /\[(false * false) -> false, (true * true) -> true,
         (false * true) -> false, (true * false) -> false] -> is_and_type,
      /\[(false * true) -> true, (true * true) -> true,
         (true * false) -> false] -> is_not_and_type,
      /\[(false * false) -> true, (true * true) -> true,
         (false * true) -> false, (true * false) -> false] -> is_not_and_type]
 : ((bool * bool) -> bool) -> and_type
Abbreviation "AND_TASTER" defined.
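AND_TASTER walks a decision tree over the behaviour of its argument: at each partial input it observes whether the argument outputs a value or asks for another cell, and it classifies the argument accordingly. The Python fragment below is a minimal sketch of the same decision tree for ordinary Python functions, which cannot be inspected directly. As a stand-in, inputs are passed as thunks that raise an exception when an unfilled cell is demanded. The names `Need`, `probe`, and `and_taster`, and the thunk-based protocol, are assumptions of this sketch rather than anything CDSO provides.

```python
class Need(Exception):
    """Raised when the probed function demands a cell that is not filled."""
    def __init__(self, cell):
        super().__init__(cell)
        self.cell = cell


def probe(f, known):
    """Run f on the partial input `known` and report its first action."""
    def arg(i):
        def thunk():
            if i not in known:
                raise Need(i)
            return known[i]
        return thunk
    try:
        return ("output", f(arg(1), arg(2)))
    except Need as need:
        return ("valof", need.cell)


def and_taster(f):
    NOT = "IS_NOT_AN_AND"
    first = probe(f, {})
    if first[0] == "output":                 # answers without looking
        return NOT
    a = first[1]                             # cell inspected first
    b = 3 - a                                # the other cell
    side = "LEFT" if a == 1 else "RIGHT"
    if probe(f, {a: True}) != ("valof", b):
        return NOT
    if probe(f, {a: True, b: True}) != ("output", True):
        return NOT
    if probe(f, {a: True, b: False}) != ("output", False):
        return NOT
    second = probe(f, {a: False})
    if second == ("output", False):          # ff decided from one cell
        return "IS_" + side + "_AND"
    if second[0] == "output":
        return NOT
    # f inspects the other cell even though the first one is ff.
    if probe(f, {a: False, b: True}) != ("output", False):
        return NOT
    if probe(f, {a: False, b: False}) != ("output", False):
        return NOT
    return "IS_" + side + "_STRICT_AND"


print(and_taster(lambda x, y: x() and y()))
print(and_taster(lambda x, y: y() and x()))
```

The two candidates at the end compute the same truth table but inspect their inputs in opposite orders, so the taster tells them apart; this is the kind of intensional distinction the CDSO algorithm draws.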


Appendix  C 


CDSO  and  PCF  Syntax 


We present the syntax from our implementation of CDSO and PCF in a form similar to ML-Yacc
input (semantic actions are omitted). Names in capitals are terminals; names in lower case are
nonterminals. Where there is a possibility of confusion, we have placed single quotes around a
symbol.


C.1  CDSO syntax

prog :
     | expr
     | command
     | dcds_decla

(* Expressions -- begin *)
expr : { event_list }
     | algo_decl
     | CURRY ( expr )
     | UNCURRY ( expr )
     | expr '|' expr
     | expr . expr
     | < expr , expr >
     | ( expr , expr )
     | FIX ( expr )
     | ( expr )
     | ID
(* Expressions -- end *)

(* Commands -- begin *)
command : LET ID = expr
        | PRINT ID
        | LOAD FILE
        | LOADECHO FILE
        | TRACE ON
        | TRACE OFF
        | TIMER ON
        | TIMER OFF
        | TYPING ON
        | TYPING OFF
        | SHOW INTEGER ID
        | SHOW MORE INTEGER ID
        | HIERARCHY FILE
        | ENV
        | PCF
(* Commands -- end *)

(* Dcds declaration -- begin *)
dcds_decla : LETREC dcds_decl
           | LET dcds_decl

dcds_decl : ID = DCDS component END

component :
          | CELL cell_name VALUES value_list access_list component
          | GRAFT cell_name access_list component

cell_name : ID
          | VAR
          | { event_list } cell_name
          | ( cell_name . tag )

tag : ID
    | arexpr
    | interval

value : ID
      | VALOF cell_name
      | OUTPUT value
      | arexpr
      | ( value . value )
      | VAR WITH boolexp
      | interval

interval : [ . . ]
         | [ int . . ]
         | [ . . int ]
         | [ int . . int ]

int : INTEGER
    | ~ INTEGER

value_list : value
           | value , value_list

access_list :
            | ACCESS enabling

enabling : event_list
         | event_list OR enabling

event_list :
           | event
           | event , event_list

event : cell_name = value
(* Dcds declaration -- end *)

(* Arithmetic expression -- begin *)
arexpr : int
       | VAR
       | ~ VAR
       | ~ ( arexpr )
       | arexpr PLUS arexpr
       | arexpr SUB arexpr
       | arexpr MULT arexpr
       | arexpr DIV arexpr
       | ( arexpr )
(* Arithmetic expression -- end *)

(* Boolean expressions -- begin *)
boolexp : arexpr > arexpr
        | arexpr >= arexpr
        | arexpr < arexpr
        | arexpr <= arexpr
        | value = value
        | value != value
        | boolexp OR boolexp
        | boolexp AND boolexp
        | ( boolexp )
(* Boolean expressions -- end *)

(* Algorithm declaration -- begin *)
algo_decl : ALGO body_list END

body_list :
          | body_list body

body : REQUEST ext_cell_name DO instruction END

ext_cell_name : cell_name
              | cell_name WITH boolexp

instruction : OUTPUT value
            | VALOF cell_name IS query_list END
            | from_do_list
            | OMEGA

from_do_list : from_do
             | from_do_list from_do

from_do : FROM { event_list } DO instruction END

query : value ':' instruction

query_list :
           | query_list query
(* Algorithm declaration -- end *)
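As a small illustration of how the grammar above is read, the `interval` and `int` productions admit four shapes of interval, with `~` marking a negative integer. The Python fragment below is a minimal sketch of a recognizer for just these two productions. It is not the ML-Yacc parser of the implementation; the function names are hypothetical, and the sketch assumes that `~` is written immediately before the digits and that whitespace between tokens is optional.

```python
import re

_INT = r"(~?\d+)"
# [ . . ]  |  [ int . . ]  |  [ . . int ]  |  [ int . . int ]
_INTERVAL = re.compile(
    r"\[\s*" + _INT + r"?\s*\.\s*\.\s*" + _INT + r"?\s*\]"
)


def parse_int(token):
    """int : INTEGER | ~ INTEGER, where ~ denotes negation."""
    return -int(token[1:]) if token.startswith("~") else int(token)


def parse_interval(text):
    """Return (low, high); None stands for an absent bound."""
    match = _INTERVAL.fullmatch(text.strip())
    if match is None:
        raise ValueError("not an interval: " + repr(text))
    low, high = match.groups()
    return (
        None if low is None else parse_int(low),
        None if high is None else parse_int(high),
    )


print(parse_interval("[..]"))
print(parse_interval("[~3..5]"))
```

The unbounded form `[..]` is the one used in the declaration of `int` in Appendix B.3, where every integer is an admissible value of cell `N`.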


C.2  PCF  syntax 


program :
        | expr
        | VAL ID = expr
        | LOAD FILE
        | PRINT ID
        | QUIT

expr : TRUE
     | FALSE
     | int
     | ID
     | expr expr
     | FN ID => expr
     | LET ID = expr IN expr END
     | LETREC ID = expr IN expr END
     | bop
     | IF expr THEN expr ELSE expr
     | ( expr , expr )
     | FST expr
     | SND expr
     | expr :: expr
     | HD expr
     | TL expr
     | [ ]
     | NULL expr
     | ( expr )

int : INTEGER
    | ~ INTEGER

bop : expr + expr
    | expr - expr
    | expr * expr
    | expr / expr
    | expr = expr
    | expr < expr
    | expr > expr
    | expr <= expr
    | expr >= expr
    | expr AND expr
    | expr OR expr